The major envelope glycoprotein (gp120) of HIV-1 has been the object of intensive
investigation since the initial identification of HIV-1 as the etiological agent of AIDS
(Barre-Sinoussi et al., 1983). The gp120 molecule is of interest as a vaccine candidate
(Berman et al., 1988; Arthur et al., 1987), as the mediator of viral attachment via the virus
receptor CD4 (Dalgleish et al., 1984; Klatzman et al., 1984) and the spread of the virus by
cell-to-cell fusion (syncytia formation), and as an agent with immunosuppressive effects of
its own (Shalaby et al., 1987; Diamond et al., 1988). It is also a potential mediator of the
pathogenesis of HIV-1 in AIDS (Siliciano et al., 1988; Sodroski et al., 1986) and has been
suggested to be the viral protein most accessible to immune attack.

Currently, gp120 is considered to be the best candidate for a subunit vaccine, because:
(i) gp120 is known to possess the CD4 binding domain by which HIV attaches to its target
cells, (ii) HIV infectivity can be neutralized in vitro by antibodies to gp120, (iii) the majority
of the in vitro neutralizing activity present in the serum of HIV infected individuals can be
removed with a gp120 affinity column, and (iv) the gp120/gp41 complex appears to be
essential for the transmission of HIV by cell-to-cell fusion. See, e.g., Hu et al., Nature
328:721-724 (1987) (vaccinia virus-HIV env recombinant vaccine); Arthur et al., J. Virol.
63(12):5046-5053 (1989) (purified gp120); and Berman et al., Proc. Natl. Acad. Sci. USA
85:5200-5204 (1988) (recombinant envelope glycoprotein gp120).

The gp120 molecule is synthesized as part of a membrane-bound glycoprotein, gp160
(Allan et al., 1985). Via a host-cell mediated process, gp160 is cleaved to form gp120 and
the integral membrane protein gp41 (Robey et al., 1985). Together gp120 and gp41 form
the spikes observed on the surface of newly released HIV-1 virions (Gelderblom et al., 1987).
As there is no covalent attachment between gp120 and gp41, free gp120 is released from
the surface of virions and infected cells (Gelderblom et al., 1985).

The gp120 molecule consists of a polypeptide core of 60,000 daltons; extensive
modification by N-linked glycosylation increases the apparent molecular weight of the
molecule to 120,000 (Lasky et al., Science 233:209-212 (1986)). The amino acid sequence
of gp120 contains five relatively conserved domains interspersed with five hypervariable
domains (Modrow et al., J. Virology 61(2):570 (1987); Willey et al., Proc. Natl. Acad. Sci.
USA 83:5038-5042 (1986)). The hypervariable domains contain extensive amino acid
substitutions, insertions and deletions. Sequence variations in these domains result in up to
25% overall sequence variability between gp120 molecules from the various viral isolates.
Despite this variation, several structural and functional elements of gp120 are highly
conserved. Among these are the ability of gp120 to bind to the viral receptor CD4, the ability
of gp120 to interact with gp41 to induce fusion of the viral and host cell membranes, the
positions of the 18 cysteine residues in the gp120 primary sequence, and the positions of 13
of the approximately 24 N-linked glycosylation sites in the gp120 sequence.

HIV ENVELOPE POLYPEPTIDES

Field of the Invention 

This invention is concerned with antigens of the HIV virus, and with novel physiologically
active polypeptides found in the HIV env glycoprotein.

Background of the Invention

Acquired immunodeficiency syndrome (AIDS) is caused by a retrovirus identified as the 
human immunodeficiency virus (HIV). A number of immunologic abnormalities have been 
described in AIDS including abnormalities in B-cell function, abnormal antibody response, 
defective monocyte cell function, impaired cytokine production, depressed natural killer and 
cytotoxic cell function, and defective ability of lymphocytes to recognize and respond to
soluble antigens. Other immunologic abnormalities associated with AIDS have been reported. 
Among the more important immunologic defects in patients with AIDS is the depletion of the 
T4 helper/inducer lymphocyte population. 

In spite of the profound immunodeficiency observed in AIDS, the mechanism(s)
responsible for immunodeficiency are not clearly understood. Several postulates exist. One
accepted view is that defects in immune responsiveness are due to selective infection of
helper T cells by HIV resulting in impairment of helper T-cell function and eventual depletion
of cells necessary for a normal immune response. In vitro and in vivo studies showed that
HIV can also infect monocytes, which are known to play an essential role as accessory cells
in the immune response. HIV may also result in immunodeficiency by interfering with normal
cytokine production in an infected cell, resulting in secondary immunodeficiency such as,
for example, IL-1 and IL-2 deficiency. An additional means of HIV-induced immunodeficiency
consists of the production of factors which are capable of suppressing the immune response.
None of these models resolves the question of whether a component of HIV per se, rather
than infection by replicative virus, is responsible for the immunologic abnormalities associated
with AIDS.

The HIV env protein has been extensively described, and the amino acid and RNA
sequences encoding HIV env from a number of HIV strains are known (Modrow, S. et al., J.
Virology 61(2):570 (1987)). The HIV virion is covered by a membrane or envelope derived
from the outer membrane of host cells. The membrane contains a population of envelope
glycoproteins (gp160) anchored in the membrane bilayer at their carboxyl terminal region.
Each glycoprotein contains two segments. The N-terminal segment, called gp120 by virtue
of its relative molecular weight of about 120 kD, protrudes into the aqueous environment
surrounding the virion. The C-terminal segment, called gp41, spans the membrane. gp120
and gp41 are linked by a peptide bond that is particularly susceptible to proteolytic cleavage;
see, e.g., McCune et al., EPO Application No. 0 335 635, priority 28 March 88, and references
cited therein.

Many workers in the field have prepared mutagenic and fragment variants of gp120.
See, e.g.: Matsushita et al., J. Virology 62:2107-2114 (1988); Rusche et al., Proc. Natl.
Acad. Sci. USA 85:3198-3202 (1988); Goudsmit et al., AIDS 2:157-164 (1988); Javaherian
et al., Proc. Natl. Acad. Sci. USA 86:6768-6772 (1989); Lasky et al., Cell 50:975-985
(1987); Kowalski et al., Science 237:1351-1355 (1987); Willey et al., Proc. Natl. Acad. Sci.
USA 83:5038-5042 (1986); Modrow et al., J. Virology 61:570-578 (1987).

The disulfide bonding pattern within gp120 and the positions of actual oligosaccharide
moieties on the molecule would be useful information for directing mutagenesis and
fragmentation studies aimed at defining the functional domains of gp120 and sites for
potential pharmacological interruption of its functions (e.g., type-common neutralizing
epitopes). This information has been difficult to obtain due to the small amounts of gp120
available from natural sources, the complexity of the disulfide bonding and oligosaccharide
structures in gp120, and uncertainty regarding the functionality or structural relevance (Moore
et al., in press) of rgp120 produced in non-mammalian systems.

The inventors herein have surprisingly discovered that certain regions of native gp120
exist in specific three-dimensional conformation, which conformation is conserved over
isotype and strain.

It is an object of this invention to provide novel polypeptides which are useful as
diagnostic tools for assaying biological samples for evidence of HIV infection.

It is a further object of this invention to provide novel polypeptides which are usable
for vaccines, and for pharmacologic interruption of the course of HIV infection.

It is a further object of this invention to provide methods for preparing such
polypeptides, and antibodies directed to such polypeptides.

Other objects, features, and characteristics of the present invention will become
apparent upon consideration of the following description and the appended claims.

Summary of the Invention

The objects of this invention are accomplished by the preparation and administration
of compositions comprising isolated cyclized polypeptides which are suitable for
administration to a human or non-human patient having or at risk of having HIV infection.
These cyclized polypeptides are selected from the following:

a) CVKLTPLCCNTSVITQAC [SEQ. ID NO. 1] and containing less than
about 28 amino acid residues;

b) PIHYCAPAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRP [SEQ. ID NO. 2] and
containing less than about 45 amino acid residues;

c) CNNKTFNGTGPC [SEQ. ID NO. 3] and containing less than about 22
amino acid residues;

d) CAPAGFAILKCCTNVSTVQC [SEQ. ID NO. 4] and containing less
than about 30 amino acid residues;

e) PIHYCCTHGIRP [SEQ. ID NO. 5] and containing less than about 22
amino acid residues;

f) GGDPEIVTHSFNCGGEFFYCNSLPCRIKQFINMWQEVGKAMYAPPISGQIRCSSNITG
[SEQ. ID NO. 6] and containing less than about 65 amino acid residues;

g) CGGEFFYCCRIKQFINMWQEVGKAMYAPPISGQIRC
[SEQ. ID NO. 7] and containing less than about 45 amino acid residues;

h) CASDAKAYDTEVHNVWATHAC [SEQ. ID NO. 8] and containing
less than about 30 amino acid residues; and

i) TTTLFCASDAKAYDTEVHNVWATHACVPTDPN [SEQ. ID NO. 9] and
containing less than about 50 amino acid residues.
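
By way of illustration only, the size limits recited above can be checked mechanically. The following minimal sketch takes four of the core sequences as listed in items a), c), d) and h), compares each length against its stated upper bound, and counts cysteine residues. The names `PEPTIDES` and `check_peptide` are hypothetical and form no part of this disclosure; the even-cysteine check rests on the assumption that every cysteine is to be paired in a disulfide.

```python
# Illustrative sketch only: all names here are assumptions, not part of the disclosure.
# Core sequences copied from items a), c), d) and h) above, with the stated size limits.
PEPTIDES = {
    "SEQ ID NO. 1": ("CVKLTPLCCNTSVITQAC", 28),
    "SEQ ID NO. 3": ("CNNKTFNGTGPC", 22),
    "SEQ ID NO. 4": ("CAPAGFAILKCCTNVSTVQC", 30),
    "SEQ ID NO. 8": ("CASDAKAYDTEVHNVWATHAC", 30),
}


def check_peptide(sequence, max_residues):
    """Return (length, cysteine_count, within_limit) for one peptide."""
    length = len(sequence)
    cysteines = sequence.count("C")
    # "less than about N residues" is treated here as a strict upper bound.
    return length, cysteines, length < max_residues


if __name__ == "__main__":
    for name, (seq, limit) in PEPTIDES.items():
        length, cysteines, ok = check_peptide(seq, limit)
        paired = cysteines % 2 == 0  # assumption: all Cys pair into disulfides
        print(f"{name}: {length} residues, {cysteines} Cys "
              f"(even: {paired}), under {limit}: {ok}")
```

Longer polypeptides that comprise one of these core sequences would pass the same check so long as their total length stays under the recited bound.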
Additionally, this invention is also directed to compositions comprising an isolated
polypeptide having an antigenic determinant or determinants immunologically cross-reactive
with a determinant of the HIV env polypeptide of strain HTLV-IIIB having an amino acid
sequence selected from the group consisting of

a) residues 1-80;

b) residues 8-180;

c) residues 165-260; 

d) residues 160-260; 

e) residues 260-310; and

f) residues 320-479. 
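
Because the residue ranges above are stated as inclusive, 1-based positions in the mature sequence, a small helper makes the indexing explicit. The sketch below is illustrative only: the function name `extract_region` is hypothetical, and the toy string stands in for a real sequence, which is not reproduced here.

```python
# Illustrative sketch; the function name and the toy sequence are hypothetical.
def extract_region(mature_sequence, first, last):
    """Return residues first..last (1-indexed, inclusive) of a mature sequence."""
    if not 1 <= first <= last <= len(mature_sequence):
        raise ValueError("residue range lies outside the sequence")
    # Python slices are 0-indexed and end-exclusive, hence first - 1 and last.
    return mature_sequence[first - 1:last]


if __name__ == "__main__":
    toy = "ABCDEFGHIJ"  # placeholder only, NOT a gp120 sequence
    print(extract_region(toy, 2, 4))
```

Applied to a mature gp120 sequence numbered without its signal sequence, `extract_region(seq, 165, 260)` would return the 96 residues of range c).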

This invention is particularly directed to vaccines comprising the compositions of this 
invention. The compositions of this invention, including variant analogues thereof, are also 
useful in diagnostic assays for HIV neutralizing antibody in patient samples. 

Monoclonal antibodies directed to the isolated polypeptides of this invention are
provided, characterized by their affinity for ligand, epitope binding, and ability to a) block
CD4/gp120 binding, b) neutralize HIV virions, c) reduce reverse transcriptase activity in vitro,
and d) inhibit syncytia formation.

These antibodies are useful as diagnostics for the presence of HIV infection in a patient
or patient sample, and for affinity purification of HIV env. These antibodies are also useful
in passively immunizing patients infected with HIV. In certain embodiments, antibodies are
provided which are conjugated to a cytotoxin, a water-insoluble matrix, or to a detectable
marker.

Antibodies directed to HIV env epitopes have been described in the literature; however,
it should be noted that, due to the variety and confusion among authors currently as to
numbering systems for HIV env sequences, not all antibodies described in the literature as
directed to certain regions will actually correspond to the same residue numbers as defined
herein (see e.g. Matsushita et al., J. Virol. 62:2107-2114 (1988); EPO Application No.
EP 339 504; Rusche et al., Proc. Natl. Acad. Sci. USA 85:3198-3202 (1988); Looney et al.,
Science 241:357-359 (1988)).

Brief Description of the Drawings 

FIGURE 1 provides the amino acid sequences of (a) the mature envelope glycoprotein
(gp120) from the IIIB isolate of HIV-1 [SEQ. ID NO. 10], and (b) the N-terminal sequence
portion of the recombinant fusion glycoproteins (9AA [SEQ. ID NO. 11] or CL44 [SEQ. ID NO.
12]) from the herpes simplex gD1. Fusion sites between the gD1 and gp120 segments in the
9AA and CL44 constructions are marked with (*) and (°°), respectively. The letter T refers
to observed tryptic cleavage of the gp120 segment, and the peptides are ordered sequentially
starting at the N-terminus of the molecule. Lower case letters following the T number
indicate other unexpected proteolytic cleavages. The letter H refers to the observed tryptic
cleavage of the herpes simplex gD1 protein portion of CL44. Peptide T2' contains the fusion
site in CL44. The cysteine residues of gp120 are shaded, and potential N-glycosylation sites
are indicated with a dot above the corresponding asparagine residue.



FIGURE 2 shows a reversed-phase HPLC tryptic map of RCM CL44. This 
chromatogram was generated with 7.5 nmol of trypsin-digested RCM CL44. Chromatography 
conditions were as described in Experimental Procedures. Peaks were collected and identified 
by AAA and in some cases confirmed by N-terminal sequence analysis (Table I). Identified 
peaks are labelled according to the nomenclature given in Figure 1. Peptides containing
potential tryptic sites that were not hydrolyzed are designated by two T numbers separated 
by a comma. 

FIGURE 3 shows a reversed-phase HPLC tryptic map of 9AA. This chromatogram was
generated with 6.8 nmol of sample. Chromatography conditions were as described in the
Example herein. Peaks containing cysteine residues were identified by N-terminal sequence
analysis; these identifications are summarized in Table II.

FIGURE 4 shows the results of further manipulations of tryptic peptides from the map
of 9AA to isolate individual disulfides. The chromatograms are details of microbore
reversed-phase HPLC separations of peptides resulting from: (a) treatment of peptides T12,
T13, and T14 (Peak C, Figure 3) with PNGase F followed by endoproteinase Asp-N, (b)
treatment of peptides T3, T4, and T11 (Peak F, Figure 3) with PNGase F followed by
endoproteinase Asp-N, and (c) treatment of peptides T28 and T31 (Peak D, Figure 3) with
S. aureus V8 protease. Chromatography conditions were as described in the Example herein.
Peak identifications were determined by N-terminal sequence analysis and are given in Table
III.

FIGURE 5 shows reversed-phase HPLC tryptic maps of endoglycosidase-treated RCM
CL44. The chromatograms are tryptic maps of: (a) untreated RCM CL44, (b) PNGase
F-treated RCM CL44, and (c) endo H-treated RCM CL44. Each tryptic map was generated
with 7.5 nmol of sample. Chromatography conditions were as described in Experimental
Procedures. Peaks were collected and identified by AAA (data not shown). Glycopeptide
peaks are labelled according to the nomenclature in Figure 1.

FIGURE 6 is a schematic representation of gp120 of the IIIB isolate of HIV-1 showing
disulfides and glycosylation sites, with the amino acids represented in single-letter code [SEQ.
ID NO. 10]. Roman numerals label the five disulfide-bonded domains. The five hypervariable
regions of Modrow et al., J. Virol. 61:570-578 (1987) are enclosed in boxes and labelled V1-V5.
Glycosylation sites containing high mannose-type and/or hybrid-type oligosaccharide
structures are indicated by a branching-Y symbol, and glycosylation sites containing
complex-type oligosaccharide structures are indicated by a V-shaped symbol.

FIGURE 7 shows a schematic representation of the HIV env glycoprotein gp120 of HIV-2,
showing disulfides and potential glycosylation sites [SEQ. ID NO. 13]. Glycosylation sites
are indicated by a shaded box around an N residue. Roman numerals label five
disulfide-bonded domains.

Detailed Description of the Invention 

HIV env is defined herein as the envelope polypeptide of Human Immunodeficiency
Virus as described above, together with its amino acid sequence variants and derivatives
produced by covalent modification of HIV env or its variants in vitro, as discussed herein.
As used herein, the term "HIV env" encompasses all forms of gp120 and/or gp160, e.g.
including fragments, fusions of gp160/120 or their fragments with other peptides, and
variantly glycosylated or unglycosylated HIV env. The HIV env of this invention is recovered
free of active virus.

HIV env and its variants are conventionally prepared in recombinant cell culture. For
example, see EP publication No. 187041. Henceforth, gp120 prepared in recombinant cell
culture is referred to as rgp120. Recombinant synthesis is preferred for reasons of safety and
economy, but it is known to prepare peptides by chemical synthesis and to purify HIV env
from viral culture; such env preparations are included within the definition of HIV env herein.

Genes encoding HIV env are obtained from the genomic cDNA of an HIV strain or from
available subgenomic clones containing the gene encoding HIV env.

This invention is directed to isolated polypeptides. Certain of these isolated
polypeptides are defined as cyclized polypeptides comprising a particular amino acid
sequence, and certain isolated polypeptides are described by reference to specific amino acid
residue numbers. The amino acid numbering reflects the mature HIV-1 gp120 amino acid
sequence as shown by Fig. 6 and Fig. 1A [SEQ. ID NO. 10], not counting any signal
sequence or other upstream regions, and is used throughout this description to conveniently
connote the intended residues; however, it is understood that this invention is not limited to
those specific residue numbers. For gp120 sequences which include the native HIV-IIIB
N-terminal signal sequence, numbering may differ. The same nucleotide and amino acid residue
numbers may not be applicable in other strains where upstream deletions or insertions change
the length of the viral genome and HIV env, but the region encoding this portion of gp120
is readily identified by reference to the teachings herein. Also, variant signal sequences (such
as those resulting from a fusion with a fragmented or heterologous signal sequence as
discussed below) may lead to a slightly different numbering; however, the precise amino acid
sequences are discerned for all embodiments by reference to Fig. 6 and/or Fig. 1A [SEQ. ID
NO. 10].

Included within the scope of the isolated polypeptides of this invention, as those terms
are used herein, are polypeptides having specified amino acid sequences, deglycosylated or
unglycosylated derivatives, homologous amino acid sequence variants, and homologous in
vitro-generated variants and derivatives, which variants are capable of exhibiting a
biological activity in common with the HIV env of Fig. 6 or Fig. 7.

Isolated polypeptide biological activity is defined as either 1) immunological
cross-reactivity with at least one isolated polypeptide, or 2) the possession of at least one adhesive
or effector function qualitatively in common with the isolated polypeptide. Examples of the
qualitative biological activities of an isolated polypeptide include the ability to bind to the viral
receptor CD4 or known monoclonal antibodies, and the ability of gp120 to interact with gp41
to induce fusion of the viral and host cell membranes.

Immunologically cross-reactive as used herein means that the candidate polypeptide 
is capable of competitively inhibiting the qualitative biological activity of an isolated 
polypeptide having this activity with polyclonal antisera raised against the known active 
analogue. Such antisera are prepared in conventional fashion by injecting goats or rabbits, 
for example, subcutaneously with the known active analogue in complete Freund's adjuvant, 
followed by booster intraperitoneal or subcutaneous injection in incomplete Freund's adjuvant.

The ordinarily skilled worker may use the disulfide bonding pattern within gp120 and
the positions of actual oligosaccharide moieties on the molecule as described herein for
directing mutagenesis and fragmentation variants of the claimed isolated polypeptides. It is
intended that the variants of this invention include isolated polypeptides having substitutions
of one or more residues, deletions of one or more residues, and insertions of one or
more amino acid residues.

This invention also contemplates amino acid sequence variants of the isolated
polypeptides. Amino acid sequence variants are prepared with various objectives in mind,
including increasing the affinity of the isolated polypeptide for a ligand or antibody, facilitating
the stability, purification and preparation of the isolated polypeptide, modifying its plasma half
life, improving therapeutic efficacy, and lessening the severity or occurrence of side effects
during therapeutic use of the isolated polypeptide. In the discussion below, amino acid
sequence variants of the isolated polypeptide are provided, exemplary of the variants that
may be selected.

Amino acid sequence variants of the isolated polypeptide fall into one or more of three
classes: insertional, substitutional, or deletional variants. These variants ordinarily are
prepared by site-specific mutagenesis of nucleotides in the DNA encoding the isolated
polypeptide, by which DNA encoding the variant is obtained, and thereafter expressing the
DNA in recombinant cell culture. However, fragments having up to about 100-150 amino
acid residues are prepared conveniently by in vitro synthesis. The following discussion
applies to any isolated polypeptide to the extent it is applicable to its structure or function.

The amino acid sequence variants of the isolated polypeptide are predetermined
variants not found in nature or naturally occurring alleles. The isolated polypeptide variants
typically exhibit the same qualitative biological (for example, antibody binding) activity as the
naturally occurring isolated polypeptide or isolated polypeptide analogue. However, isolated
polypeptide variants and derivatives that are not capable of binding to antibodies are useful
nonetheless (a) as a reagent in diagnostic assays for isolated polypeptide or antibodies to the
isolated polypeptide, (b) when insolubilized in accord with known methods, as agents for
purifying anti-isolated polypeptide antibodies from antisera or hybridoma culture
supernatants, and (c) as immunogens for raising antibodies to isolated polypeptide or as
immunoassay kit components (labelled, as a competitive reagent for the native isolated
polypeptide, or unlabelled, as a standard for isolated polypeptide assay) so long as at least one
isolated polypeptide epitope remains active.

While the site for introducing an amino acid sequence variation is predetermined, the
mutation per se need not be predetermined. For example, in order to optimize the
performance of a mutation at a given site, random or saturation mutagenesis (where all 20
possible residues are inserted) is conducted at the target codon and the expressed isolated
polypeptide variant is screened for the optimal combination of desired activities. Such
screening is within the ordinary skill in the art.
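
The set of variants that saturation mutagenesis at one site would yield can be enumerated at the protein level. The sketch below is an illustration under stated assumptions: it works directly on a one-letter amino acid string rather than on the target codon, and the names `AMINO_ACIDS` and `saturation_variants` are hypothetical. The example peptide is the core sequence of SEQ. ID NO. 3.

```python
# Illustrative sketch; names are hypothetical and the model is protein-level only.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter code


def saturation_variants(sequence, position):
    """Return the 20 sequences carrying each standard residue at `position`.

    `position` is 1-indexed. The unchanged (wild-type) sequence is included,
    since "all 20 possible residues" includes the residue already present.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    i = position - 1
    return [sequence[:i] + aa + sequence[i + 1:] for aa in AMINO_ACIDS]


if __name__ == "__main__":
    for variant in saturation_variants("CNNKTFNGTGPC", 4):
        print(variant)
```

Each variant in the returned list would then be expressed and screened; the screening assay itself is outside the scope of this sketch.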

Amino acid insertions usually will be on the order of about from 1 to 10 amino acid
residues; substitutions are typically introduced for single residues; and deletions will range
about from 1 to 30 residues. Deletions or insertions preferably are made in adjacent pairs,
i.e. a deletion of 2 residues or insertion of 2 residues. It will be amply apparent from the
following discussion that substitutions, deletions, insertions or any combination thereof are
introduced or combined to arrive at a final construct. Insertional amino acid sequence
variants of the isolated polypeptide are those in which one or more amino acid residues
extraneous to the isolated polypeptide are introduced into a predetermined site in the target
isolated polypeptide and which displace the preexisting residues.

Commonly, insertional variants are fusions of heterologous proteins or polypeptides to
the amino or carboxyl terminus of the isolated polypeptide. Such variants are referred to as
fusions of the isolated polypeptide and a polypeptide containing a sequence which is other
than that which is normally found in the isolated polypeptide at the inserted position. Several
groups of fusions are contemplated herein.

The novel isolated polypeptides of this invention are useful in diagnostics or in 
purification of the antibodies or ligands by known immunoaffinity techniques. 

Desirable fusions of the isolated polypeptide, which may or may not also be 
immunologically active, include fusions of the mature isolated polypeptide sequence with a 
signal sequence heterologous to a native isolated polypeptide as mentioned above. Signal 
sequence fusions are employed in order to more expeditiously direct the secretion of the 
isolated polypeptide. The heterologous signal replaces the native isolated polypeptide signal, 
and when the resulting fusion is recognized, i.e. processed and cleaved by the host cell, the 
isolated polypeptide is secreted. Signals are selected based on the intended host cell, and 
may include bacterial, yeast, mammalian and viral sequences. The native HIV env signal or
the herpes gD glycoprotein signal is suitable for use in mammalian expression systems. 

C-terminal or N-terminal fusions of the isolated polypeptide or isolated polypeptide 
fragment with an immunogenic hapten or heterologous polypeptide are useful as vaccine 
components for the immunization of patients against HIV infection. Fusions of the hapten 
or heterologous polypeptide with isolated polypeptide or its active fragments which retain T- 
cell binding activity are also useful in directing cytotoxic T cells against target cells where the 
hapten or heterologous polypeptide is capable of binding to a target cell surface receptor. 

The precise site at which the fusion is made is variable; particular isolated polypeptide 
sites are selected in order to optimize the biological activity, secretion or binding 
characteristics of the isolated polypeptide. The optimal site for a particular application
will be determined by routine experimentation.

Substitutional variants are those in which at least one residue in the isolated
polypeptide has been removed and a different residue inserted in its place. Such substitutions
generally are made in accordance with the following Table 1 when it is desired to finely
modulate the characteristics of the isolated polypeptide.

                TABLE 1

Original Residue        Exemplary Substitutions

Ala                     ser
Arg                     lys
Asn                     gln; his
Asp                     glu
Cys                     ser; ala
Gln                     asn
Glu                     asp
Gly                     pro
His                     asn; gln
Ile                     leu; val
Leu                     ile; val
Lys                     arg; gln; glu
Met                     leu; ile
Phe                     met; leu; tyr
Ser                     thr
Thr                     ser
Trp                     tyr
Tyr                     trp; phe
Val                     ile; leu


Novel amino acid sequences, as well as isosteric analogs (amino acid or otherwise), are
included within the scope of this invention.
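
For illustration, Table 1 can be encoded as a lookup and used to enumerate every single-residue conservative variant of a peptide. This is a sketch only: the names `TABLE_1` and `conservative_variants` are hypothetical, the table entries are transcribed in one-letter code as read from Table 1 (with the "trp; phe" row taken to belong to Tyr), and residues absent from the table, such as Pro, simply yield no variants.

```python
# Illustrative sketch; names are hypothetical. Entries transcribed from Table 1
# in one-letter code. The Y (Tyr) row label is inferred from its position.
TABLE_1 = {
    "A": ["S"], "R": ["K"], "N": ["Q", "H"], "D": ["E"], "C": ["S", "A"],
    "Q": ["N"], "E": ["D"], "G": ["P"], "H": ["N", "Q"], "I": ["L", "V"],
    "L": ["I", "V"], "K": ["R", "Q", "E"], "M": ["L", "I"],
    "F": ["M", "L", "Y"], "S": ["T"], "T": ["S"], "W": ["Y"],
    "Y": ["W", "F"], "V": ["I", "L"],
}


def conservative_variants(sequence):
    """List (position, original, replacement, variant) for each Table 1 substitution.

    Positions are 1-indexed. Exactly one residue is changed per variant.
    """
    variants = []
    for i, residue in enumerate(sequence):
        for replacement in TABLE_1.get(residue, []):
            variant = sequence[:i] + replacement + sequence[i + 1:]
            variants.append((i + 1, residue, replacement, variant))
    return variants


if __name__ == "__main__":
    for entry in conservative_variants("CNNKTFNGTGPC"):
        print(entry)
```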

Substantial changes in function or immunological identity are made by selecting 
substitutions that are less conservative than those in Table 1, i.e., selecting residues that 
differ more significantly in their effect on maintaining (a) the structure of the polypeptide 
backbone in the area of the substitution, for example as a sheet or helical conformation, (b) 
the charge or hydrophobicity of the molecule at the target site or (c) the bulk of the side 
chain. The substitutions which in general are expected to produce the greatest changes in 
isolated polypeptide properties will be those in which (a) a hydrophilic residue, e.g. seryl or 
threonyl, is substituted for (or by) a hydrophobic residue, e.g. leucyl, isoleucyl, phenylalanyl,
valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a
residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for
(or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky
side chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g.,
glycine.
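
The four categories (a) through (d) can be expressed as a simple classifier. The sketch below is illustrative and deliberately narrow: its residue groups contain only the examples named in the text (the text says "e.g.", so the real groups are broader), and the names `radical_rules` and `_crosses` are hypothetical.

```python
# Illustrative sketch; groups hold only the residues named as examples in the text.
HYDROPHILIC = set("ST")          # seryl, threonyl
HYDROPHOBIC = set("LIFVA")       # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = set("KRH")     # lysyl, arginyl, histidyl
ELECTRONEGATIVE = set("ED")      # glutamyl, aspartyl
BULKY = set("F")                 # phenylalanine
NO_SIDE_CHAIN = set("G")         # glycine
CYS_OR_PRO = set("CP")


def _crosses(a, b, group1, group2):
    """True if one residue is in group1 and the other in group2 ("for (or by)")."""
    return (a in group1 and b in group2) or (a in group2 and b in group1)


def radical_rules(original, replacement):
    """Return the set of rule letters (a-d) that a substitution falls under."""
    rules = set()
    if original == replacement:
        return rules
    if _crosses(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        rules.add("a")
    if {original, replacement} & CYS_OR_PRO:
        rules.add("b")
    if _crosses(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        rules.add("c")
    if _crosses(original, replacement, BULKY, NO_SIDE_CHAIN):
        rules.add("d")
    return rules


if __name__ == "__main__":
    for pair in [("S", "L"), ("C", "A"), ("K", "E"), ("F", "G"), ("L", "I")]:
        print(pair, sorted(radical_rules(*pair)))
```

An empty result means only that the substitution matches none of the listed examples, not that it is necessarily conservative.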

Some deletions, insertions, and substitutions will not produce radical changes in the
characteristics of the isolated polypeptide molecule. However, when it is difficult to predict
the exact effect of the substitution, deletion, or insertion in advance of doing so, for example
when modifying an immune epitope, one skilled in the art will appreciate that the effect will
be evaluated by routine screening assays. For example, a variant typically is made by
site-specific mutagenesis of the isolated polypeptide-encoding nucleic acid, expression of the
variant nucleic acid in recombinant cell culture and, optionally, purification from the cell
culture, for example by immunoaffinity adsorption on a polyclonal anti-isolated polypeptide
column (in order to adsorb the variant by at least one remaining immune epitope). The
activity of the cell lysate or purified isolated polypeptide variant is then screened in a suitable
screening assay for the desired characteristic. For example, a change in the immunological
character of the isolated polypeptide, such as affinity for T-cell binding, is measured by a
competitive-type immunoassay. As more becomes known about the functions in vivo of the
isolated polypeptide, other assays will become useful in such screening. Modifications of
such protein properties as redox or thermal stability, hydrophobicity, susceptibility to
proteolytic degradation, or the tendency to aggregate with carriers or into multimers are
assayed by methods well known to the artisan.

Another class of isolated polypeptide variants are deletional variants. Deletions are
characterized by the removal of one or more amino acid residues from the isolated
polypeptide sequence. Typically, deletions are used to affect isolated polypeptide biological
activities; however, deletions which preserve the biological activity or immune cross-reactivity
of the isolated polypeptide are suitable.

Deletions of cysteine or other labile residues also may be desirable, for example in
increasing the oxidative stability of the isolated polypeptide. Deletion or substitution of
potential proteolysis sites, e.g. Arg Arg, is accomplished by deleting one of the basic residues
or substituting one by glutaminyl or histidyl residues.
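
The Arg Arg example can be made concrete with a short enumeration. The following sketch is illustrative only: the function name `dibasic_site_variants` is hypothetical, only Arg-Arg pairs are considered (other dibasic sites are not modeled), and each variant alters a single pair by deleting one Arg or replacing one Arg with Gln (Q) or His (H).

```python
# Illustrative sketch; the function name is hypothetical and only "RR" is modeled.
def dibasic_site_variants(sequence):
    """Return sorted variants in which one Arg-Arg site has been disrupted."""
    variants = set()
    for i in range(len(sequence) - 1):
        if sequence[i:i + 2] != "RR":
            continue
        # Deleting either Arg of the pair gives the same sequence; delete the first.
        variants.add(sequence[:i] + sequence[i + 1:])
        # Substitute either Arg of the pair by glutaminyl (Q) or histidyl (H).
        for j in (i, i + 1):
            for replacement in "QH":
                variants.add(sequence[:j] + replacement + sequence[j + 1:])
    return sorted(variants)


if __name__ == "__main__":
    print(dibasic_site_variants("ARRA"))  # toy peptide, not from the disclosure
```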
It will be understood that some variants may exhibit reduced or absent biological
activity. These variants nonetheless are useful as standards in immunoassays for the isolated
polypeptide so long as they retain at least one immune epitope of the isolated polypeptide.

It is presently believed that the three-dimensional structure of the isolated polypeptides 
and peptide compositions of the present invention is important to their functioning as 
30 described herein. Therefore, all related structural analogs which mimic the active structure 
h of those formed by the isolated polypeptides claimed herein are specifically induced within 

the scope of the present invention. 
Glycosylation variants are included within the scope of the isolated polypeptide. They 
include variants completely lacking in glycosylation (unglycosylated) and variants having at 
least one less glycosylated site than the native form (deglycosylated) as well as variants in 
which the glycosylation has been changed. Included are deglycosylated and unglycosylated 
amino acid sequence variants, deglycosylated and unglycosylated isolated polypeptide having 
the native, unmodified amino acid sequence of the isolated polypeptide, and other 
glycosylation variants. For example, substitutional or deletional mutagenesis is employed to 
eliminate the N- or O-linked glycosylation sites of the isolated polypeptide, e.g., an asparagine 
residue (not at the clip site) is deleted or substituted by a basic residue such as 
lysine or histidine. Alternatively, flanking residues making up the glycosylation site are 
substituted or deleted, even though the asparagine residues remain unchanged, in order to 
prevent glycosylation by eliminating the glycosylation recognition site. 
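
As an illustration of the two strategies just described for N-linked sites, the following Python sketch locates candidate recognition sites and removes one either by substituting the asparagine or by altering a flanking residue. It assumes the commonly used N-linked sequon Asn-X-Ser/Thr (X being any residue other than proline), which this text does not spell out; the function names are hypothetical, and O-linked sites and the clip-site restriction are not modeled.

```python
import re

# Asn-X-Ser/Thr with X != Pro; the lookahead reports overlapping sequons.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq):
    """Return 0-based positions of the Asn in each N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(seq.upper())]


def substitute_asn(seq, pos, replacement="K"):
    """Eliminate a sequon by replacing its Asn (default Lys; His also named)."""
    seq = seq.upper()
    if seq[pos] != "N":
        raise ValueError("position %d is not asparagine" % pos)
    return seq[:pos] + replacement + seq[pos + 1:]


def substitute_flank(seq, pos, replacement="A"):
    """Eliminate a sequon by replacing the Ser/Thr two residues after the Asn.

    The Asn itself is left unchanged, as in the alternative described above.
    The default replacement (Ala) is an illustrative choice only.
    """
    seq = seq.upper()
    if seq[pos] != "N" or seq[pos + 2] not in "ST":
        raise ValueError("no N-X-S/T sequon starts at position %d" % pos)
    return seq[:pos + 2] + replacement + seq[pos + 3:]
```

A call such as `find_sequons("MNGTANPSNKS")` reports the sequons at positions 1 and 8 and skips the Asn-Pro-Ser at position 5; the presence of a sequon indicates only a potential site, not that the site is actually glycosylated in a given host.
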

Unglycosylated isolated polypeptide which has the amino acid sequence of the native 
isolated polypeptide is produced in recombinant prokaryotic cell culture because prokaryotes 
are incapable of introducing glycosylation into polypeptides. 

Glycosylation variants are produced by selecting appropriate host cells or by in vitro 
methods. Yeast, for example, introduce glycosylation which varies significantly from that of 
mammalian systems. Similarly, mammalian cells having a different species (e.g. hamster, 
murine, insect, porcine, bovine or ovine) or tissue origin (e.g. lung, liver, lymphoid, 
mesenchymal or epidermal) than the source of the isolated polypeptide antigen are routinely 
screened for the ability to introduce variant glycosylation as characterized for example by 
elevated levels of mannose or variant ratios of mannose, fucose, sialic acid, and other sugars 
typically found in mammalian glycoproteins. In vitro processing of the isolated polypeptide 
typically is accomplished by enzymatic hydrolysis, e.g. neuraminidase digestion. 

Covalent modifications of the isolated polypeptide molecule which do not modify the 
clip site are included within the scope hereof. Such modifications are introduced by reacting 
targeted amino acid residues of the recovered protein with an organic derivatizing agent that 
is capable of reacting with selected side chains or terminal residues, or by harnessing 
mechanisms of post-translational modification that function in selected recombinant host 
cells. The resulting covalent derivatives are useful in programs directed at identifying residues 
important for biological activity, for immunoassays of isolated polypeptide or for the 
preparation of anti-isolated polypeptide antibodies for immunoaffinity purification of the 
recombinant isolated polypeptide. For example, complete inactivation of the biological 
activity of the protein after reaction with ninhydrin would suggest that at least one arginyl 
or lysyl residue is critical for its activity, whereafter the individual residues which were 
modified under the conditions selected are identified by isolation of a peptide fragment 
containing the modified amino acid residue. Such modifications are within the ordinary skill 
in the art and are performed without undue experimentation. 

Derivatization with bifunctional agents is useful for preparing intermolecular aggregates 
of the isolated polypeptide with polypeptides as well as for cross-linking the isolated 
polypeptide to a water insoluble support matrix or surface for use in the assay or affinity 
purification of its ligands. In addition, a study of intrachain cross-links will provide direct 
information on conformational structure. Commonly used cross-linking agents include 
sulfhydryl reagents, 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, 
N-hydroxysuccinimide esters, for example esters with 4-azidosalicylic acid, homobifunctional 
imidoesters including disuccinimidyl esters such as 3,3'-dithiobis(succinimidyl-propionate), 
and bifunctional maleimides such as bis-N-maleimido-1,8-octane. Derivatizing agents such 
as methyl-3-[(p-azidophenyl)dithio]propioimidate yield photoactivatable intermediates which 
are capable of forming cross-links in the presence of light. Alternatively, reactive water 
insoluble matrices such as cyanogen bromide activated carbohydrates and the 
reactive substrates described in U.S. patents 3,959,080; 3,969,287; 3,691,016; 4,195,128; 
4,247,642; 4,229,537; 4,055,635; and 4,330,440 are employed for protein immobilization 
and cross-linking. 

Polymers generally are covalently linked to the isolated polypeptide herein through a 
multifunctional crosslinking agent which reacts with the polymer and one or more amino acid 
or sugar residues of the protein. However, it is within the scope of this invention to directly 
crosslink the polymer by reacting a derivatized polymer with the isolated polypeptide, or vice 
versa. Covalent bonding to amino groups is accomplished by known chemistries based upon 
cyanuric chloride, carbonyl diimidazole, aldehyde reactive groups (PEG alkoxide plus diethyl 
acetal of bromoacetaldehyde; PEG plus DMSO and acetic anhydride; or PEG chloride plus the 
phenoxide of 4-hydroxybenzaldehyde), succinimidyl active esters, activated dithiocarbonate 
PEG, 2,4,5-trichlorophenylchloroformate or p-nitrophenylchloroformate activated PEG. 
Carboxyl groups are derivatized by coupling PEG-amine using carbodiimide. 

This invention is also directed to polypeptides of this invention which by definition or 
optionally are conformationally stabilized by cyclization. The peptides ordinarily are cyclized 
by covalently bonding the N- and C-terminal domains of one peptide to the corresponding 
domain of another peptide of this invention so as to form cyclooligomers containing two or 
more iterated peptide sequences, each internal peptide having substantially the same 
sequence. Further, cyclized peptides (whether cyclooligomers or cyclomonomers) are 
crosslinked to form 1-3 cyclic structures having from 2 to 6 peptides comprised therein. The 
peptides preferably are not covalently bonded through α-amino and α-carboxyl groups (head 
to tail), but rather are cross-linked through the side chains of residues located in the N- and 
C-terminal domains. The linking sites thus generally will be between the side chains of A1 
and A10 residues. Substantially identical polypeptides present in the polymerized forms of the 
peptides hereof are those which exhibit qualitative isolated polypeptide activity, 
notwithstanding the degree of amino acid sequence variation among the polypeptides. 
Variants which exhibit activity are used as subunits in homo- or heteropolymers. In 
homopolymers the peptides are the same. Heteropolymers contain different peptides, each, 
however, chosen from within the parameters described above. 

Many suitable methods per se are known for preparing mono- or poly-cyclized peptides 
as contemplated herein. Lys/Asp cyclization has been accomplished using Nα-Boc-amino 
acids on solid-phase support with Fmoc/OFm side-chain protection for Lys/Asp; the process 
is completed by piperidine treatment followed by BOP cyclization. 

Glu and Lys side chains also have been crosslinked in preparing cyclic or bicyclic 
peptides: the peptide is synthesized by solid phase chemistry on a p-methylbenzhydrylamine 
resin. The peptide is cleaved from the resin and deprotected. The cyclic peptide is formed 
using diphenylphosphorylazide in dilute dimethylformamide. For an alternative procedure, see 
Schiller et al., Peptide Protein Res. 25:171-177 (1985). See also U.S. Patent 4,547,489. 

Disulfide crosslinked or cyclized peptides are generated by conventional methods. The 
method of Pelton et al. (J. Med. Chem. 22:2370-2375 [1986]) is suitable, except that a 
greater proportion of cyclooligomers are produced by conducting the reaction in more 
concentrated solutions than the dilute reaction mixture described by Pelton et al. for the 
production of cyclomonomers. The same chemistry is useful for synthesis of dimers (using 
A1-A10 Pen plus A1-A10 Cys) or cyclooligomers or cyclomonomers (Pen A1-A10 Cys, or Pen 
A1-A10 Cys plus Cys A1-A10 Pen). Also useful are thiomethylene bridges (Tetrahedron Letters 
25(20):2067-2068 [1984]). See also Cody et al., J. Med. Chem. 28:583 (1985). 

The desired cyclic or polymeric peptides are purified by gel filtration followed by 
reversed-phase high pressure liquid chromatography or other conventional procedures. The 
peptides are sterile filtered and formulated into conventional pharmacologically acceptable 
vehicles. 

Certain post-translational derivatizations are the result of the action of recombinant 
host cells on the expressed polypeptide. Glutaminyl and asparaginyl residues are frequently 
post-translationally deamidated to the corresponding glutamyl and aspartyl residues. 
Alternatively, these residues are deamidated under mildly acidic conditions. Either form of 
these residues falls within the scope of this invention. 
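
Because either the amidated or the deamidated form of these residues is within scope, two sequences that differ only by deamidation can be treated as equivalent when comparing variants. The short Python sketch below illustrates that comparison; the function names are hypothetical and one-letter amino acid codes are assumed.

```python
# Deamidation converts Gln (Q) to Glu (E) and Asn (N) to Asp (D).
DEAMIDATION = {"Q": "E", "N": "D"}


def fully_deamidated(seq):
    """Return the sequence with every Gln/Asn replaced by Glu/Asp."""
    return "".join(DEAMIDATION.get(residue, residue) for residue in seq.upper())


def same_up_to_deamidation(seq_a, seq_b):
    """True if the two sequences differ at most by Gln/Glu or Asn/Asp."""
    return fully_deamidated(seq_a) == fully_deamidated(seq_b)
```

Here `same_up_to_deamidation("ANQK", "ADQK")` is true, whereas a pair differing at any other residue is reported as different.
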

Other post-translational modifications include hydroxylation of proline and lysine, 
phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the α-amino 
groups of lysine, arginine, and histidine side chains (T.E. Creighton, Proteins: Structure and 
Molecular Properties, W.H. Freeman & Co., San Francisco pp 79-86 [1983]), acetylation of 
the N-terminal amine and, in some instances, amidation of the C-terminal carboxyl. 

DNA encoding the isolated polypeptide is synthesized by in vitro methods or is obtained 
readily from cDNA libraries. The means for synthetic creation of the DNA encoding the 
isolated polypeptide, either by hand or with an automated apparatus, are generally known to 
one of ordinary skill in the art, particularly in light of the teachings contained herein. As 
examples of the current state of the art relating to polynucleotide synthesis, one is directed 
to Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory 
(1984), and Horvath et al., An Automated DNA Synthesizer Employing Deoxynucleoside 
3'-Phosphoramidites, Methods in Enzymology 154: 313-326, 1987. 

Alternatively, to obtain DNA encoding the isolated polypeptide, one needs only to 
conduct hybridization screening with labelled DNA encoding either the isolated polypeptide 
or isolated polypeptide fragment (usually, greater than about 20, and ordinarily about 50 bp) 
in order to detect clones which contain homologous sequences in the cDNA libraries derived 
from cells or tissues of a particular animal, followed by analyzing the clones by restriction 
enzyme analysis and nucleic acid sequencing to identify full-length clones. If full-length 
clones are not present in the library, then appropriate fragments are recovered from the 
various clones and ligated at restriction sites common to the fragments to assemble a 
full-length clone. DNA encoding isolated polypeptide from various isotypes and strains is 
obtained by probing libraries from hosts of such species with the amino acid sequences of 
the isolated polypeptide, or by synthesizing the genes in vitro. 

In general, prokaryotes are used for cloning of DNA sequences in constructing the 
vectors useful in the invention. For example, E. coli K12 strain 294 (ATCC No. 31446) is 
particularly useful. Other microbial strains which may be used include E. coli B and E. coli 
X1776 (ATCC No. 31537). These examples are illustrative rather than limiting. 
Alternatively, in vitro methods of cloning, e.g. polymerase chain reaction, are suitable. 

The isolated polypeptides of this invention are expressed directly in recombinant cell 
culture as an N-terminal methionyl analogue, or as a fusion with a polypeptide heterologous 
to the hybrid/portion, preferably a signal sequence or other polypeptide having a specific 
cleavage site at the N-terminus of the hybrid/portion. For example, in constructing a 
prokaryotic secretory expression vector for the isolated polypeptide, the native isolated 
polypeptide signal is employed with hosts that recognize that signal. When the secretory 
leader is "recognized" by the host, the host signal peptidase is capable of cleaving a fusion 
of the leader polypeptide fused at its C-terminus to the desired mature isolated polypeptide. 
For host prokaryotes that do not process the native isolated polypeptide signal, the signal is 
substituted by a prokaryotic signal selected for example from the group of the alkaline 
phosphatase, penicillinase, lpp or heat stable enterotoxin II leaders. For yeast secretion the 
native isolated polypeptide signal may be substituted by the yeast invertase, alpha factor or 
acid phosphatase leaders. In mammalian cell expression the native isolated polypeptide signal 
or native HIV env signal is satisfactory for certain isolated polypeptides, although other 
mammalian secretory protein signals are suitable, as are viral secretory leaders, for example 
the herpes simplex gD signal. 

The isolated polypeptide may be expressed in any host cell, but preferably is 
synthesized in mammalian hosts. However, host cells from prokaryotes, fungi, yeast, insects 
and the like are also used for expression. Exemplary prokaryotes are the strains suitable 
for cloning as well as E. coli W3110 (F-, λ-, prototrophic, ATCC No. 27325), other 
enterobacteriaceae such as Serratia marcescens, bacilli and various pseudomonads. 
Preferably the host cell should secrete minimal amounts of proteolytic enzymes. 

Expression hosts typically are transformed with DNA encoding the isolated polypeptide 
which has been ligated into an expression vector. Such vectors ordinarily carry a replication 
site (although this is not necessary where chromosomal integration will occur). Expression 
vectors also include marker sequences which are capable of providing phenotypic selection 
in transformed cells, as will be discussed further below. For example, E. coli is typically 
transformed using pBR322, a plasmid derived from an E. coli species (Bolivar et al., Gene 2: 
95 [1977]). pBR322 contains genes for ampicillin and tetracycline resistance and thus 
provides easy means for identifying transformed cells, whether for purposes of cloning or 
expression. Expression vectors also optimally will contain sequences which are useful for the 
control of transcription and translation, e.g., promoters and Shine-Dalgarno sequences (for 
prokaryotes) or promoters and enhancers (for mammalian cells). The promoters may be, but 
need not be, inducible; even powerful constitutive promoters such as the CMV promoter for 
mammalian hosts may produce the isolated polypeptide without host cell toxicity. While it 
is conceivable that expression vectors need not contain any expression control, replicative 
sequences or selection genes, their absence may hamper the identification of transformants 
and the achievement of high level peptide expression. 

Promoters suitable for use with prokaryotic hosts illustratively include the β-lactamase 
and lactose promoter systems (Chang et al., Nature 275: 615 [1978]; and Goeddel et al., 
Nature 281: 544 [1979]), alkaline phosphatase, the tryptophan (trp) promoter system 
(Goeddel, Nucleic Acids Res. 8: 4057 (1980) and EPO Appln. Publ. No. 36,776) and hybrid 
promoters such as the tac promoter (H. de Boer et al., Proc. Natl. Acad. Sci. USA 80: 21-25 
[1983]). However, other functional bacterial promoters are suitable. Their nucleotide 
sequences are generally known, thereby enabling a skilled worker operably to ligate them to 
DNA encoding the isolated polypeptide (Siebenlist et al., Cell 20: 269 [1980]) using linkers 
or adaptors to supply any required restriction sites. Promoters for use in bacterial systems 
also will contain a Shine-Dalgarno (S.D.) sequence operably linked to the DNA encoding the 
isolated polypeptide. 

In addition to prokaryotes, eukaryotic microbes such as yeast or filamentous fungi are 
satisfactory. Saccharomyces cerevisiae is the most commonly used eukaryotic 
microorganism, although a number of other strains are commonly available. The plasmid YRp7 
is a satisfactory expression vector in yeast (Stinchcomb et al., Nature 282: 39 (1979); 
Kingsman et al., Gene 7: 141 (1979); Tschemper et al., Gene 10: 157 (1980)). This plasmid 
already contains the trp1 gene which provides a selection marker for a mutant strain of yeast 
lacking the ability to grow in tryptophan, for example ATCC no. 44076 or PEP4-1 (Jones, 
Genetics 85: 12 [1977]). The presence of the trp1 lesion as a characteristic of the yeast 
host cell genome then provides an effective environment for detecting transformation by 
growth in the absence of tryptophan. 

Suitable promoting sequences for use with yeast hosts include the promoters for 
3-phosphoglycerate kinase (Hitzeman et al., J. Biol. Chem. 255: 2073 (1980)) or other 
glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg. 7: 149 (1968); and Holland, 
Biochemistry 17: 4900 (1978)), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, 
hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 
3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose 
isomerase, and glucokinase. 

Other yeast promoters, which are inducible promoters having the additional advantage 
of transcription controlled by growth conditions, are the promoter regions for alcohol 
dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with 
nitrogen metabolism, metallothionein, glyceraldehyde-3-phosphate dehydrogenase, and 
enzymes responsible for maltose and galactose utilization. Suitable vectors and promoters 
for use in yeast expression are further described in R. Hitzeman et al., European Patent 
Publication No. 73,657A. 

Expression control sequences are known for eukaryotes. Virtually all eukaryotic genes 
have an AT-rich region located approximately 25 to 30 bases upstream from the site where 
transcription is initiated. Another sequence found 70 to 80 bases upstream from the start 
of transcription of many genes is a CXCAAT region where X may be any nucleotide. At the 
3' end of most eukaryotic genes is an AATAAA sequence which may be the signal for 
addition of the poly A tail to the 3' end of the coding sequence. All of these sequences are 
inserted into mammalian expression vectors. 
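
When checking a vector sequence for the control elements just listed, a simple pattern search is often a useful first pass. The Python sketch below locates CXCAAT and AATAAA motifs on the given strand and reports the A+T fraction of a window; it is a minimal illustration only, with hypothetical function names, and finding a motif does not by itself show that the element is functional.

```python
import re

# Patterns taken from the text: CXCAAT (X = any nucleotide) and AATAAA.
# Lookaheads are used so that overlapping occurrences are all reported.
MOTIFS = {
    "CXCAAT": re.compile(r"(?=(C[ACGT]CAAT))"),
    "AATAAA": re.compile(r"(?=(AATAAA))"),
}


def scan_motifs(dna):
    """Return {motif name: [0-based start positions]} for one strand."""
    dna = dna.upper()
    return {name: [m.start() for m in pattern.finditer(dna)]
            for name, pattern in MOTIFS.items()}


def at_fraction(dna, start, length):
    """Fraction of A or T bases in dna[start:start + length]."""
    window = dna.upper()[start:start + length]
    if not window:
        raise ValueError("empty window")
    return sum(base in "AT" for base in window) / len(window)
```

For instance, `scan_motifs("GGCACAATCCTTTAATAAAGG")` reports a CXCAAT match at position 2 and an AATAAA match at position 13; `at_fraction` can then be applied to the region roughly 25 to 30 bases upstream of a presumed transcription start to judge whether it is AT-rich.
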

Suitable promoters for controlling transcription from vectors in mammalian host cells 
are readily obtained from various sources, for example, the genomes of viruses such as 
polyoma virus, SV40, adenovirus, MMV (steroid inducible), retroviruses (e.g. the LTR of HIV), 
hepatitis-B virus and most preferably cytomegalovirus, or from heterologous mammalian 
promoters, e.g. the beta actin promoter. The early and late promoters of SV40 are 
conveniently obtained as an SV40 restriction fragment which also contains the SV40 viral 
origin of replication. Fiers et al., Nature, 273: 113 (1978). The immediate early promoter 
of the human cytomegalovirus is conveniently obtained as a HindIII E restriction fragment. 
Greenaway, P.J. et al., Gene 18: 355-360 (1982). 

Transcription of a DNA encoding the isolated polypeptide by higher eukaryotes is 
increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting 
elements of DNA, usually about from 10-300 bp, that act on a promoter to increase its 
transcription. Enhancers are relatively orientation and position independent, having been 
found 5' (Laimins et al., PNAS 78: 993 [1981]) and 3' (Lusky, M.L., et al., Mol. Cell Bio. 3: 
1108 (1983)) to the transcription unit, within an intron (Banerji, J.L. et al., Cell 33: 729 
(1983)) as well as within the coding sequence itself (Osborne, T.F., et al., Mol. Cell Bio. 4: 
1293 [1984]). Many enhancer sequences are now known from mammalian genes (globin, 
elastase, albumin, α-fetoprotein and insulin). Typically, however, one will use an enhancer 
from a eukaryotic cell virus. Examples include the SV40 enhancer on the late side of the 
replication origin (bp 100-270), the cytomegalovirus early promoter enhancer, the polyoma 
enhancer on the late side of the replication origin, and adenovirus enhancers. 

Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, 
human or nucleated cells from other multicellular organisms) will also contain sequences 
necessary for the termination of transcription which may affect mRNA expression. These 
regions are transcribed as polyadenylated segments in the untranslated portion of the mRNA 
encoding the hybrid immunoglobulin. The 3' untranslated regions also include transcription 
termination sites. 

Expression vectors may contain a selection gene, also termed a selectable marker. 
Examples of suitable selectable markers for mammalian cells are dihydrofolate reductase 
(DHFR), thymidine kinase (TK) or neomycin. When such selectable markers are successfully 
transferred into a mammalian host cell, the transformed mammalian host cell is able to survive 
if placed under selective pressure. There are two widely used distinct categories of selective 
regimes. The first category is based on a cell's metabolism and the use of a mutant cell line 
which lacks the ability to grow independent of a supplemented media. Two examples are 
CHO DHFR- cells and mouse LTK- cells. These cells lack the ability to grow without the 
addition of such nutrients as thymidine or hypoxanthine. Because these cells lack certain 
genes necessary for a complete nucleotide synthesis pathway, they cannot survive unless the 
missing nucleotides are provided in a supplemented media. An alternative to supplementing 
the media is to introduce an intact DHFR or TK gene into cells lacking the respective genes, 
thus altering their growth requirements. Individual cells which were not transformed with the 
DHFR or TK gene will not be capable of survival in non-supplemented media. In preferred 
embodiments herein, CHO cells which are DHFR- are used for recombinant expression of the 
isolated polypeptide. 

The second category of selective regimes is dominant selection which refers to a 
selection scheme used in any cell type and does not require the use of a mutant cell line. 
These schemes typically use a drug to arrest growth of a host cell. Those cells which are 
successfully transformed with a heterologous gene express a protein conferring drug 
resistance and thus survive the selection regimen. Examples of such dominant selection use 
the drugs neomycin (Southern et al., J. Molec. Appl. Genet. 1: 327 (1982)), mycophenolic 
acid (Mulligan et al., Science 209: 1422 (1980)) or hygromycin (Sugden et al., Mol. Cell. Biol. 
5: 410-413 (1985)). The three examples given above employ bacterial genes under 
eukaryotic control to convey resistance to the appropriate drug G418 or neomycin (geneticin), 
xgpt (mycophenolic acid) or hygromycin, respectively. 

"Amplification" refers to the increase or replication of an isolated region within a cell's 
chromosomal DNA. Amplification is achieved using a selection agent, e.g. methotrexate 
(MTX) which inactivates DHFR. Amplification or the making of successive copies of the 
DHFR gene results in greater amounts of DHFR being produced in the face of greater amounts 
of MTX. Amplification pressure is applied notwithstanding the presence of endogenous 
DHFR, by adding ever greater amounts of MTX to the media. Amplification of a desired gene 
can be achieved by cotransfecting a mammalian host cell with a plasmid having a DNA 
encoding a desired protein and the DHFR or amplification gene permitting cointegration. One 
ensures that the cell requires more DHFR, which requirement is met by replication of the 
selection gene, by selecting only for cells that can grow in the presence of ever-greater MTX 
concentration. So long as the gene encoding a desired heterologous protein has cointegrated 
with the selection gene, replication of this gene gives rise to replication of the gene encoding 
the desired protein. The result is that increased copies of the gene, i.e. an amplified gene, 
encoding the desired heterologous protein express more of the desired protein. 

Suitable eukaryotic host cells for expressing the isolated polypeptide include monkey 
kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); human embryonic kidney 
line (293 or 293 cells subcloned for growth in suspension culture, Graham, F.L. et al., J. Gen. 
Virol. 36: 59 (1977)); baby hamster kidney cells (BHK, ATCC CCL 10); Chinese hamster 
ovary cells-DHFR (CHO, Urlaub and Chasin, PNAS (USA) 77: 4216 [1980]); mouse Sertoli 
cells (TM4, Mather, J.P., Biol. Reprod. 23: 243-251 (1980)); monkey kidney cells (CV1 ATCC 
CCL 70); African green monkey kidney cells (VERO-76, ATCC CRL-1587); human cervical 
carcinoma cells (HELA, ATCC CCL 2); canine kidney cells (MDCK, ATCC CCL 34); buffalo rat 
liver cells (BRL 3A, ATCC CRL 1442); human lung cells (W138, ATCC CCL 75); human liver 
cells (Hep G2, HB 8065); mouse mammary tumor (MMT 060562, ATCC CCL 51); and TRI 
cells (Mather, J.P. et al., Annals N.Y. Acad. Sci. 383: 44-68 [1982]). 

Construction of suitable vectors containing the desired coding and control sequences 
employs standard ligation techniques. Isolated plasmids or DNA fragments are cleaved, 
tailored, and religated in the form desired to form the plasmids required. 

For analysis to confirm correct sequences in plasmids constructed, the ligation mixtures 
are used to transform E. coli K12 strain 294 (ATCC 31446) and successful transformants 
selected by ampicillin or tetracycline resistance where appropriate. Plasmids from the 
transformants are prepared, analyzed by restriction and/or sequenced by the method of 
Messing et al., Nucleic Acids Res. 9: 309 (1981) or by the method of Maxam et al., Methods 
in Enzymology 65: 499 (1980). 

Host cells are transformed with the expression vectors of this invention and cultured 
in conventional nutrient media modified as appropriate for inducing promoters, selecting 
transformants or amplifying the genes encoding the desired sequences. The culture 
conditions, such as temperature, pH and the like, are those previously used with the host cell 
selected for expression, and will be apparent to the ordinarily skilled artisan. 

The host cells referred to in this disclosure encompass cells in in vitro culture as well 
as cells which are within a host animal. 

"Transformation" means introducing DNA into an organism so that the DNA is 
replicable, either as an extrachromosomal element or by chromosomal integration. Unless 
indicated otherwise, the method used herein for transformation of the host cells is the 
method of Graham, F. and van der Eb, A., Virology 52: 456-457 (1973). However, other 
methods for introducing DNA into cells such as by nuclear injection or by protoplast fusion 
may also be used. If prokaryotic cells or cells which contain substantial cell wall 
constructions are used, the preferred method of transfection is calcium treatment using 
calcium chloride as described by Cohen, F.N. et al., Proc. Natl. Acad. Sci. (USA), 69: 2110 
(1972). 

"Transfection" refers to the introduction of DNA into a host cell whether or not any 
coding sequences are ultimately expressed. Numerous methods of transfection are known 
to the ordinarily skilled artisan, for example, CaPO4 and electroporation. Transformation of 
the host cell is the indicia of successful transfection. 

The novel polypeptide of this invention is recovered and purified from recombinant cell 
cultures by known methods, including ammonium sulfate or ethanol precipitation, acid 
extraction, anion or cation exchange chromatography, phosphocellulose chromatography, 
immunoaffinity chromatography, hydroxyapatite chromatography and lectin chromatography. 
See, e.g., the purification methods described in EP 187,041. Moreover, reverse-phase HPLC 
and chromatography using ligands for the isolated polypeptide are useful for purification. It 
is presently preferred to utilize gel permeation chromatography and anion exchange 
chromatography, and more preferred to use cation exchange and hydrophobic interaction 
chromatography (HIC) according to standard protocols. 

Optionally, the isolated polypeptide is recovered and purified by passage over a column 
of isolated polypeptide-antibody covalently coupled to aldehyde silica by a standard procedure 
(Roy et al., Journal of Chromatography 303:225-228 (1984)), washing of the column with 
a saline solution, and analyzing the eluant by standard methods such as quantitative amino 
acid analysis. Procedures utilizing monoclonal antibodies coupled to glycerol-coated 
controlled pore glass are desirable for the practice of this invention. Optionally, low 
concentrations (approximately 1-5 mM) of calcium ion may be present during purification. 
The isolated polypeptide may preferably be purified in the presence of a protease inhibitor 
such as PMSF. 

The isolated polypeptide is placed into pharmaceutically acceptable, sterile, isotonic 
formulations together with required cofactors, and optionally is administered by standard 
means well known in the field. The formulation is preferably liquid, and is ordinarily a 
physiologic salt solution containing non-phosphate buffer at pH 6.8-7.6, or may be a lyophilized 
powder. 

The isolated polypeptide compositions to be used in therapy will be formulated and 
dosages established in a fashion consistent with good medical practice taking into account 
the disorder to be treated, the condition of the individual patient, the site of delivery of the 
isolated polypeptide, the method of administration and other factors known to practitioners. 

The isolated polypeptide is prepared for administration by mixing the isolated 
polypeptide at the desired degree of purity with adjuvants or physiologically acceptable 
carriers, i.e. carriers which are nontoxic to recipients at the dosages and concentrations 
employed. Adjuvants and carriers are substances that in themselves share no immune 
epitopes with the target antigen, but which stimulate the immune response to the target 
antigen. Ordinarily, this will entail combining the isolated polypeptide with buffers, low 
molecular weight (less than about 10 residues) polypeptides, proteins, amino acids, 
carbohydrates including glucose or dextrans, chelating agents such as EDTA, and other 
excipients. Freund's adjuvant (a mineral oil emulsion) commonly has been used for this 
purpose, as have a variety of toxic microbial substances such as mycobacterial extracts and 
cytokines such as tumor necrosis factor and interferon gamma in U.S. patent 4,963,354. 
Although antigen is desirably administered with an adjuvant, in situations where the initial 
inoculation is delivered with an adjuvant, boosters with antigen may not require adjuvant. 
Carriers often act as adjuvants, but are generally distinguished from adjuvants in that carriers 
comprise water insoluble macromolecular particulate structures which aggregate the antigen. 
Typical carriers include aluminum hydroxide, latex particles, bentonite and liposomes. 

It is envisioned that injections (intramuscular or subcutaneous) will be the primary route 
for therapeutic administration of the vaccines of this invention; intravenous delivery, or 
delivery through catheter or other surgical tubing, is also used. Alternative routes include 
tablets and the like, commercially available nebulizers for liquid formulations, and inhalation 
of lyophilized or aerosolized receptors. Liquid formulations may be utilized after reconstitution 
from powder formulations. 

The novel polypeptide may also be administered via microspheres, liposomes, other
microparticulate delivery systems or sustained release formulations placed in certain tissues
including blood. Suitable examples of sustained release carriers include semipermeable
polymer matrices in the form of shaped articles, e.g. suppositories, or microcapsules.
Implantable or microcapsular sustained release matrices include polylactides (U.S. Patent
3,773,919, EP 58,481), copolymers of L-glutamic acid and gamma ethyl-L-glutamate (U.
Sidman et al., Biopolymers 22(1): 547-556 (1985)), poly(2-hydroxyethyl-methacrylate) or
ethylene vinyl acetate (R. Langer et al., J. Biomed. Mater. Res. 15: 167-277 (1981) and R.
Langer, Chem. Tech. 12: 98-105 (1982)). Liposomes containing the isolated polypeptide are
prepared by well-known methods: DE 3,218,121A; Epstein et al., Proc. Natl. Acad. Sci. USA,
82:3688-3692 (1985); Hwang et al., Proc. Natl. Acad. Sci. USA, 77:4030-4034 (1980); EP
52322A; EP 36676A; EP 88046A; EP 143949A; EP 142541A; Japanese patent application
83-11808; U.S. Patents 4,485,045 and 4,544,545; and EP 102,342A. Ordinarily the
liposomes are of the small (about 200-800 Angstroms) unilamellar type in which the lipid
content is greater than about 30 mol. % cholesterol, the selected proportion being adjusted
for the optimal rate of the polypeptide leakage.

The dose of the isolated polypeptide administered will be dependent upon the
properties of the isolated polypeptide employed, e.g. its binding activity and in vivo plasma
half-life, the concentration of the isolated polypeptide in the formulation, the administration
route, the site and rate of dosage, the clinical tolerance of the patient involved, the
pathological condition afflicting the patient and the like, as is well within the skill of the
physician. Generally, doses of from about 0.5 x 10* to 5 x 10' molar of isolated polypeptide
per patient per administration are preferred. Different dosages are utilized during a series of
sequential inoculations; the practitioner may administer an initial inoculation and then boost
with relatively smaller doses of isolated polypeptide vaccine.

The isolated polypeptide vaccines of this invention may be administered in a variety
of ways and to different classes of recipients. The vaccines are used to vaccinate individuals
who may or may not be at risk of exposure to HIV, and additionally, the vaccines are
desirably administered to seropositive individuals and to individuals who have been previously
exposed to HIV (see e.g. Salk, Nature 327:473-476 (1987); and Salk et al., Science
195:834-847 (1977)).

The isolated polypeptide may be administered in combination with other antigens in a
single inoculation "cocktail". The isolated polypeptide vaccines may also be administered as
one of a series of inoculations administered over time. Such a series may include inoculation
with the same or different preparations of HIV antigens or other vaccines.

The adequacy of the vaccination parameters chosen, e.g. dose, schedule, adjuvant
choice and the like, is determined by taking aliquots of serum from the patient and assaying
antibody titers during the course of the immunization program. Alternatively, the presence
of T cells may be monitored by conventional methods as described in Example 1 below. In
addition, the clinical condition of the patient will be monitored for the desired effect, e.g.
anti-infective effect. If inadequate vaccination is achieved then the patient can be boosted
with further isolated polypeptide vaccinations and the vaccination parameters can be modified
in a fashion expected to potentiate the immune response, e.g. increase the amount of antigen
and/or adjuvant, complex the antigen with a carrier or conjugate it to an immunogenic
protein, or vary the route of administration.

For use of the isolated polypeptide as a vaccine, it is currently preferred that at least
three separate inoculations with isolated polypeptide be administered, with a second
inoculation being administered more than two, preferably three to eight, and more preferably
approximately four weeks following the first inoculation. It is preferred that a third
inoculation be administered several months later than the second "boost" inoculation,
preferably at least more than five months following the first inoculation, more preferably six
months to two years following the first inoculation, and even more preferably eight months
to one year following the first inoculation. Periodic inoculations beyond the third are also
desirable to enhance the patient's "immune memory". See Anderson et al., J. Infectious
Diseases 160(6):960-969 (Dec. 1989). Generally, infrequent immunizations with isolated
polypeptide spaced at relatively long intervals are more preferred than frequent immunizations
in eliciting maximum antibody responses, and in eliciting a protective effect.
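
The preferred timing described above can be laid out as a simple date calculation. The following is only an illustrative sketch: the function name, the example start date, and the use of 30-day months are assumptions made for the illustration and are not part of the disclosure.

```python
from datetime import date, timedelta

def preferred_schedule(first):
    """Illustrative dates for a three-inoculation series (hypothetical helper)."""
    def weeks(n):
        return timedelta(weeks=n)

    def months(n):
        return timedelta(days=30 * n)  # assumption: a month is counted as 30 days

    return {
        "first": first,
        # second inoculation: preferably three to eight weeks after the first
        "second_window": (first + weeks(3), first + weeks(8)),
        # more preferably approximately four weeks after the first
        "second_target": first + weeks(4),
        # third inoculation: even more preferably eight months to one year after the first
        "third_window": (first + months(8), first + months(12)),
    }

schedule = preferred_schedule(date(1991, 1, 7))  # hypothetical start date
for name, value in schedule.items():
    print(name, value)
```

The broader ranges stated in the text (for example, six months to two years for the third inoculation) could be substituted for the narrower windows used here.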

The polypeptides of this invention may optionally be administered along with other
pharmacologic agents used to treat AIDS or ARC or other HIV-related diseases and infections,
such as AZT, CD4, antibiotics, immunomodulators such as interferon, anti-inflammatory
agents, and anti-tumor agents.

Antibodies

This invention is also directed to monoclonal antibodies. In accordance with this
invention, monoclonal antibodies specifically binding an epitope of an isolated polypeptide or
antigenically active fragments thereof are isolated from continuous hybrid cell lines formed
by the fusion of antigen-primed immune lymphocytes with myeloma cells. The antibodies
of the subject invention are obtained through routine screening. An assay is used for
screening monoclonal antibodies for their cytotoxic potential as ricin A chain containing
immunotoxins. The assay involves treating cells with dilutions of the test antibody followed
by a Fab fragment of a secondary antibody coupled to ricin A chain ("indirect assay"). The
cytotoxicity of the indirect assay is compared to that of the direct assay where the
monoclonal antibody is coupled to ricin A chain. The indirect assay accurately predicts the
potency of a given monoclonal antibody as an immunotoxin and is thus useful in screening
monoclonal antibodies for use as immunotoxins; see also Vitetta et al., Science
238:1098-1104 (1987), and Weltman et al., Cancer Res. 47:5552 (1987).

Monoclonal antibodies are highly specific, being directed against a single antigenic site.
Furthermore, in contrast to conventional antibody (polyclonal) preparations which typically
include different antibodies directed against different determinants (epitopes), each
monoclonal antibody is directed against a single determinant on the antigen. Monoclonal
antibodies are useful to improve the selectivity and specificity of diagnostic and analytical
assay methods using antigen-antibody binding. A second advantage of monoclonal
antibodies is that they are synthesized by the hybridoma culture, uncontaminated by other
immunoglobulins. Monoclonal antibodies may be prepared from supernatants of cultured
hybridoma cells or from ascites induced by intraperitoneal inoculation of hybridoma cells into
mice.

The hybridoma technique described originally by Kohler and Milstein, Eur. J. Immunol.,
6:511 (1976) has been widely applied to produce hybrid cell lines that secrete high levels of
monoclonal antibodies against many specific antigens.

In particular embodiments of this invention, an antibody is obtained by immunizing mice
such as Balb/c or, preferably, C57BL/6, against an isolated polypeptide and screening for a
clonal antibody that, when preincubated with the isolated polypeptide, prevents its binding
to isolated polypeptide. Monoclonal antibodies may desirably have differences in affinity,
immunoglobulin class, species of origin, or epitope; they may be antibodies which are
expressed in recombinant cell culture or that are predetermined amino acid sequence variants
of known antibodies, including chimeras of antibodies having a variable region directed
against an isolated polypeptide, and a human constant region.

The route and schedule of immunization of the host animal or cultured
antibody-producing cells therefrom are generally in keeping with established and conventional
techniques for antibody stimulation and production. Applicants typically have employed mice
as the test model although it is contemplated that any mammalian subject including human
subjects or antibody producing cells therefrom can be manipulated according to the processes
of this invention to serve as the basis for production of mammalian, including human, hybrid
cell lines.

After immunization, immune lymphoid cells are fused with myeloma cells to generate
a hybrid cell line which can be cultivated and subcultivated indefinitely, to produce large
quantities of monoclonal antibodies. For purposes of this invention, the immune lymphoid
cells selected for fusion are lymphocytes and their normal differentiated progeny, taken either
from lymph node tissue or spleen tissue from immunized animals. Applicants prefer to
employ immune spleen cells, since they offer a more concentrated and convenient source of
antibody producing cells with respect to the mouse system. The myeloma cells provide the
basis for continuous propagation of the fused hybrid. Myeloma cells are tumor cells derived
from plasma cells.

It is possible to fuse cells of one species with another. However, it is preferred that 
the source of immunized antibody producing cells and myeloma be from the same species. 

The hybrid cell lines can be maintained in culture in vitro in cell culture media. The cell
lines of this invention can be selected and/or maintained in a composition comprising the
continuous cell line in hypoxanthine-aminopterin-thymidine (HAT) medium. In fact, once the
hybridoma cell line is established, it can be maintained on a variety of nutritionally adequate
media. Moreover, the hybrid cell lines can be stored and preserved in any number of
conventional ways, including freezing and storage under liquid nitrogen. Frozen cell lines can
be revived and cultured indefinitely with resumed synthesis and secretion of monoclonal
antibody. The secreted antibody is recovered from tissue culture supernatant by conventional
methods such as precipitation, ion exchange chromatography, affinity chromatography, or
the like. The antibodies described herein are also recovered from hybridoma cell cultures by
conventional methods for purification of IgG or IgM, as the case may be, that heretofore have
been used to purify these immunoglobulins from pooled plasma, e.g. ethanol or polyethylene
glycol precipitation procedures. The purified antibodies are sterile filtered, and optionally are
conjugated to a detectable marker such as an enzyme or spin label for use in diagnostic
assays of isolated polypeptide in test samples.

While the invention covers using mouse monoclonal antibodies, the invention is not so
limited; in fact, human antibodies may be used and may prove to be preferable. Such
antibodies can be obtained by using human hybridomas (Cote et al., Monoclonal Antibodies
and Cancer Therapy, Alan R. Liss, p. 77 (1985)). In fact, according to the invention,
techniques developed for the production of chimeric antibodies (Morrison et al., Proc. Natl.
Acad. Sci., 81:6851 (1984); Neuberger et al., Nature 312:604 (1984); Takeda et al., Nature
314:452 (1985)) by splicing the genes from a mouse antibody molecule of appropriate
antigen specificity together with genes from a human antibody molecule of appropriate
biological activity (such as ability to activate human complement and mediate ADCC) can be
used; such antibodies are within the scope of this invention.

As another alternative to the cell fusion technique, EBV-immortalized B cells are used
to produce the monoclonal antibodies of the subject invention. Other methods for producing
monoclonal antibodies, such as recombinant DNA, are also contemplated.

Immunotoxins

This invention is also directed to immunochemical derivatives of the antibodies of this
invention such as immunotoxins (conjugates of the antibody and a cytotoxic moiety). The
antibodies are also used to induce lysis through the natural complement process, and to
interact with antibody dependent cytotoxic cells normally present.

Purified, sterile filtered antibodies are optionally conjugated to a cytotoxin such as ricin
for use in AIDS therapy. EPO Publication 0 279 688 published 24 August 1988 illustrates
methods for making and using immunotoxins for the treatment of HIV infection.

Immunotoxins of this invention, capable of specifically binding regions of HIV env, are
used to kill cells that are already infected and are actively producing new virus. Killing is
accomplished by the binding of the immunotoxin to viral coat protein which is expressed on
infected cells. The immunotoxin is then internalized and kills the cell. Infected cells that have
incorporated viral genome into their DNA but are not synthesizing viral protein (i.e., cells in
which the virus is latent) may not be susceptible to killing by immunotoxin until they begin
to synthesize virus. The antibodies of this invention which span the clip site and/or the other
antibodies described herein may be used alone or in any combination for delivering toxins
to infected cells. In addition, a toxin-antibody conjugate can bind to circulating viruses or
viral coat protein which will then effect killing of cells that internalize virus or coat protein.



WO 91/15512 



PCT/US91/02166

The subject invention provides a highly selective method of destroying HIV infected cells,
utilizing the antibodies described herein.

While not wishing to be constrained to any particular theory of operation of the
invention, it is believed that the expression of the target antigen on the infected cell surface
is transient. The antibodies must be capable of reaching the site on the cell surface where
the antigen resides and interacting with it. After the antibody complexes with the antigen,
endocytosis takes place, carrying the toxin into the cell.

The immunotoxins of this invention are particularly helpful in killing
monocytes/macrophages infected with the HIV virus. In contrast to the transient production
of virus from T cells, macrophages produce high levels of virus for long periods of time.
Current therapy is ineffective in inhibiting the production of new viruses in these cells.

Not all monoclonal antibodies specific for an isolated polypeptide make highly cytotoxic
immunotoxins; however, assays are routinely and commonly used in the field to predict the
ability of an antibody to function as part of an immunotoxin. Preferably the antibodies used
cross react with several (or all) strains of HIV.

The cytotoxic moiety of the immunotoxin may be a cytotoxic drug or an enzymatically
active toxin of bacterial, fungal, plant or animal origin, or an enzymatically active fragment
of such a toxin. Enzymatically active toxins and fragments thereof used are diphtheria A
chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from Pseudomonas
aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, Aleurites fordii
proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and PAP-S),
Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin and the trichothecenes. In another
embodiment, the antibodies are conjugated to small molecule anticancer drugs such as
cis-platin or 5FU. Conjugates of the monoclonal antibody and such cytotoxic moieties are made
using a variety of bifunctional protein coupling agents. Examples of such reagents are SPDP,
IT, bifunctional derivatives of imidoesters such as dimethyl adipimidate HCl, active esters
such as disuccinimidyl suberate, aldehydes such as glutaraldehyde, bis-azido compounds such
as bis(p-azidobenzoyl) hexanediamine, bis-diazonium derivatives such as
bis-(p-diazoniumbenzoyl)-ethylenediamine, diisocyanates such as tolylene 2,6-diisocyanate and
bis-active fluorine compounds such as 1,5-difluoro-2,4-dinitrobenzene. The lysing portion of
a toxin may be joined to the Fab fragment of the antibodies.

Immunotoxins can be made in a variety of ways, as discussed herein. Commonly
known crosslinking reagents can be used to yield stable conjugates.

Advantageously, monoclonal antibodies specifically binding the domain of the protein
which is exposed on the infected cell surface are conjugated to ricin A chain. Most
advantageously the ricin A chain is deglycosylated and produced through recombinant means.
An advantageous method of making the ricin immunotoxin is described in Vitetta et al.,
Science 238:1098 (1987).

When used to kill infected human cells in vitro for diagnostic purposes, the conjugates
will typically be added to the cell culture medium at a concentration of at least about 10 nM.
The formulation and mode of administration for in vitro use are not critical. Aqueous
formulations that are compatible with the culture or perfusion medium will normally be used.
Cytotoxicity may be read by conventional techniques.

Cytotoxic radiopharmaceuticals for treating infected cells may be made by conjugating
radioactive isotopes (e.g. I, Y, Pr) to the antibodies. Advantageously alpha particle-emitting
isotopes are used. The term "cytotoxic moiety" as used herein is intended to include such
isotopes.

In a preferred embodiment, ricin A chain is deglycosylated or produced without
oligosaccharides, to decrease its clearance by irrelevant clearance mechanisms (e.g., the
liver). In another embodiment, whole ricin (A chain plus B chain) is conjugated to antibody
if the galactose binding property of B-chain can be blocked ("blocked ricin").

In a further embodiment toxin-conjugates are made with Fab or F(ab')2 fragments.
Because of their relatively small size these fragments can better penetrate tissue to reach
infected cells.

In another embodiment, fusogenic liposomes are filled with a cytotoxic drug and the
liposomes are coated with antibodies specifically binding HIV env.

Antibody Dependent Cellular Cytotoxicity

The present invention also involves a method based on the use of antibodies which are
(a) directed against an isolated polypeptide, and (b) belong to a subclass or isotype that is
capable of mediating the lysis of HIV virus infected cells to which the antibody molecule
binds. More specifically, these antibodies should belong to a subclass or isotype that, upon
complexing with cell surface proteins, activates serum complement and/or mediates antibody
dependent cellular cytotoxicity (ADCC) by activating effector cells such as natural killer cells
or macrophages.

The present invention is also directed to the use of these antibodies, in their native
form, for AIDS therapy. For example, IgG2a and IgG3 mouse antibodies which bind
HIV-associated cell surface antigens can be used in vitro for AIDS therapy. In fact, since HIV
env is present on infected monocytes and T-lymphocytes, the antibodies disclosed herein and
their therapeutic use have general applicability.

Biological activity of antibodies is known to be determined, to a large extent, by the
Fc region of the antibody molecule (Unanue and Benacerraf, Textbook of Immunology, 2nd
Edition, Williams & Wilkins, p. 218 (1984)). This includes their ability to activate complement
and to mediate antibody-dependent cellular cytotoxicity (ADCC) as effected by leukocytes.
Antibodies of different classes and subclasses differ in this respect, and, according to the
present invention, antibodies of those classes having the desired biological activity are
selected. For example, mouse immunoglobulins of the IgG3 and IgG2a class are capable of
activating serum complement upon binding to the target cells which express the cognate
antigen.

In general, antibodies of the IgG2a and IgG3 subclass and occasionally IgG1 can
mediate ADCC, and antibodies of the IgG3, IgG2a, and IgM subclasses bind and activate
serum complement. Complement activation generally requires the binding of at least two IgG
molecules in close proximity on the target cell. However, the binding of only one IgM
molecule activates serum complement.
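
The isotype rules stated above can be collected into a small lookup, for instance when sorting candidate clones by isotype. This is only a sketch: the dictionary and function names are hypothetical, and the entries restate the preceding paragraph without adding anything the text does not say.

```python
# Mouse isotypes, restating only what the paragraph above states.
MEDIATES_ADCC = {"IgG2a": "yes", "IgG3": "yes", "IgG1": "occasionally"}
ACTIVATES_COMPLEMENT = {"IgG3", "IgG2a", "IgM"}

def effector_profile(isotype):
    """Summarize the effector functions the text attributes to an isotype."""
    return {
        "adcc": MEDIATES_ADCC.get(isotype, "not stated"),
        "complement": isotype in ACTIVATES_COMPLEMENT,
    }

def min_bound_for_complement(isotype):
    """Minimum number of bound molecules needed to activate complement."""
    if isotype == "IgM":
        return 1  # a single bound IgM molecule suffices
    if isotype in ACTIVATES_COMPLEMENT:
        return 2  # at least two IgG molecules in close proximity
    return None   # the text makes no statement for this isotype

for iso in ("IgG1", "IgG2a", "IgG3", "IgM"):
    print(iso, effector_profile(iso), min_bound_for_complement(iso))
```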

The ability of any particular antibody to mediate lysis of the target cell by complement
activation and/or ADCC can be assayed. The cells of interest are grown and labeled in vitro;
the antibody is added to the cell culture in combination with either serum complement or
immune cells which may be activated by the antigen antibody complexes. Cytolysis of the
target cells is detected by the release of label from the lysed cells. In fact, antibodies can be
screened using the patient's own serum as a source of complement and/or immune cells. The
antibody that is capable of activating complement or mediating ADCC in the in vitro test can
then be used therapeutically in that particular patient.

Antibodies of virtually any origin can be used for this purpose provided they bind an
isolated polypeptide epitope and can activate complement or mediate ADCC. Monoclonal
antibodies offer the advantage of a continuous, ample supply.

Therapeutic and Other Uses of the Antibodies

When used in vivo for therapy, the antibodies of the subject invention are administered
to the patient in therapeutically effective amounts (i.e. amounts that restore T cell counts).
They will normally be administered parenterally. The dose and dosage regimen will depend
upon the degree of the infection, the characteristics of the particular immunotoxin (when
used), e.g., its therapeutic index, the patient, and the patient's history. Advantageously the
immunotoxin is administered continuously over a period of 1-2 weeks, intravenously to treat
cells in the vasculature and subcutaneously and intraperitoneally to treat regional lymph
nodes. Optionally, the administration is made during the course of adjunct therapy such as
combined cycles of tumor necrosis factor and interferon or other immunomodulatory agent.

For parenteral administration the antibodies will be formulated in a unit dosage
injectable form (solution, suspension, emulsion) in association with a pharmaceutically
acceptable parenteral vehicle. Such vehicles are inherently nontoxic and non-therapeutic.
Examples of such vehicles are water, saline, Ringer's solution, dextrose solution, and 5%
human serum albumin. Nonaqueous vehicles such as fixed oils and ethyl oleate can also be
used. Liposomes may be used as carriers. The vehicle may contain minor amounts of
additives such as substances that enhance isotonicity and chemical stability, e.g., buffers and
preservatives. The antibodies will typically be formulated in such vehicles at concentrations
of about 1 mg/ml to 10 mg/ml.

Use of IgM antibodies is not currently preferred, since the antigen is highly specific for
the target cells and rarely occurs on normal cells. IgG molecules, by being smaller, may be
more able than IgM molecules to localize to infected cells.

There is evidence that complement activation in vivo leads to a variety of biological
effects, including the induction of an inflammatory response and the activation of
macrophages (Unanue and Benacerraf, Textbook of Immunology, 2nd Edition, Williams &
Wilkins, p. 218 (1984)). The increased vasodilation accompanying inflammation may
increase the ability of various anti-AIDS agents to localize in infected cells. Therefore,
antigen-antibody combinations of the type specified by this invention can be used
therapeutically in many ways. Additionally, purified antigens (Hakomori, Ann. Rev. Immunol.
2:103 (1984)) or anti-idiotypic antibodies (Nepom et al., Proc. Natl. Acad. Sci. 81:2864
(1985); Koprowski et al., Proc. Natl. Acad. Sci. 81:216 (1984)) relating to such antigens
could be used to induce an active immune response in human patients. Such a response
includes the formation of antibodies capable of activating human complement and mediating
ADCC and by such mechanisms cause infected cell destruction.

The antibodies of the subject invention are also useful in the diagnosis of HIV in test
samples. They are employed as one axis of a sandwich assay for an isolated polypeptide of
HIV env, together with a polyclonal or monoclonal antibody directed at another sterically-free
epitope of HIV env. For use in some embodiments of sandwich assays the anti-isolated
polypeptide antibody is bound to an insolubilizing support or is labelled with a detectable
moiety following conventional procedures used with other monoclonal antibodies. In another
embodiment a labelled antibody, e.g. labelled goat anti-murine IgG, capable of binding the
anti-isolated polypeptide antibody is employed to detect the isolated polypeptide or HIV env
binding using procedures previously known per se.

The antibody compositions used in therapy are formulated and dosages established in
a fashion consistent with good medical practice taking into account the disorder to be
treated, the condition of the individual patient, the site of delivery of the composition, the
method of administration and other factors known to practitioners. The antibody
compositions are prepared for administration according to the description of preparation of
polypeptides for administration, supra.

In order to facilitate understanding of the following examples certain frequently
occurring methods and/or terms will be described.

"Plasmids" are designated by a lower case p preceded and/or followed by capital letters
and/or numbers. The starting plasmids herein are either commercially available, publicly
available on an unrestricted basis, or can be constructed from available plasmids in accord
with published procedures. In addition, equivalent plasmids to those described are known in
the art and will be apparent to the ordinarily skilled artisan.

In particular, it is preferred that these plasmids have some or all of the following
characteristics: (1) possess a minimal number of host-organism sequences; (2) be stable in
the desired host; (3) be capable of being present in a high copy number in the desired host;
(4) possess a regulatable promoter; and (5) have at least one DNA sequence coding for a
selectable trait present on a portion of the plasmid separate from that where the novel DNA
sequence will be inserted. Alteration of plasmids to meet the above criteria is easily
performed by those of ordinary skill in the art in light of the available literature and the
teachings herein. It is to be understood that additional cloning vectors may now exist or will
be discovered which have the above-identified properties and are therefore suitable for use
in the present invention and these vectors are also contemplated as being within the scope
of this invention.

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction enzyme
that acts only at certain sequences in the DNA. The various restriction enzymes used herein
are commercially available and their reaction conditions, cofactors and other requirements
were used as would be known to the ordinarily skilled artisan. For analytical purposes,
typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme in about 20
µl of buffer solution. For the purpose of isolating DNA fragments for plasmid construction,
typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in a larger volume.
Appropriate buffers and substrate amounts for particular restriction enzymes are specified by
the manufacturer. Incubation times of about 1 hour at 37°C are ordinarily used, but may
vary in accordance with the supplier's instructions. After digestion the reaction is
electrophoresed directly on a polyacrylamide gel to isolate the desired fragment.
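
The enzyme-to-DNA ratios implied by these typical amounts can be worked out directly. The short sketch below assumes that the endpoints of the preparative ranges pair up (20 units with 5 µg, 250 units with 50 µg); that pairing is a reading of the text for illustration, not something the text states, and the function name is hypothetical.

```python
def units_per_microgram(enzyme_units, dna_ug):
    """Restriction enzyme units used per microgram of DNA."""
    return enzyme_units / dna_ug

# Analytical digest: about 2 units for 1 ug of DNA in about 20 ul.
analytical = units_per_microgram(2, 1)

# Preparative digest: 20 to 250 units for 5 to 50 ug of DNA
# (assumption: low end pairs with low end, high end with high end).
preparative_low = units_per_microgram(20, 5)
preparative_high = units_per_microgram(250, 50)

print(analytical, preparative_low, preparative_high)
```

Under that assumption the preparative digests use roughly two to two and a half times more enzyme per microgram than the analytical digest.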

Size separation of the cleaved fragments is performed using 8 percent polyacrylamide
gel described by Goeddel, D. et al., Nucleic Acids Res. 8: 4057 (1980).

"PCR" (polymerase chain reaction) refers to a technique whereby a piece of DNA is
amplified. Oligonucleotide primers which correspond to the 3' and 5' ends (sense or
antisense strand-check) of the segment of the DNA to be amplified are hybridized under
appropriate conditions and the enzyme Taq polymerase, or equivalent enzyme, is used to
synthesize copies of the DNA located between the primers.
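
The relationship between the primers and the two strands can be made concrete with a short sketch. It assumes the template is written as the sense strand, 5' to 3'; the forward primer then matches the 5' end of the segment and the reverse primer is the reverse complement of its 3' end. The sequence, coordinates, six-base primer length and function names are all hypothetical and chosen only to keep the example readable (working primers are considerably longer).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def design_primers(sense_strand, start, end, length):
    """Pick primers flanking sense_strand[start:end] (hypothetical helper)."""
    target = sense_strand[start:end]
    forward = target[:length]                       # identical to the sense strand
    reverse = reverse_complement(target[-length:])  # anneals to the sense strand
    return forward, reverse

# Made-up template: the segment to amplify lies between two flanking sites.
template = "GGATCC" + "ATGAAAGTGCCCTAA" + "GAATTC"
fwd, rev = design_primers(template, 6, 21, length=6)
print(fwd, rev)
```

Both primers are written 5' to 3', so extension from each one proceeds toward the other across the segment.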

"Dephosphorylation" refers to the removal of the terminal 5' phosphates by treatment
with bacterial alkaline phosphatase (BAP). This procedure prevents the two restriction
cleaved ends of a DNA fragment from "circularizing" or forming a closed loop that would
impede insertion of another DNA fragment at the restriction site. Procedures and reagents
for dephosphorylation are conventional. Maniatis, T. et al., Molecular Cloning pp. 133-134
(1982). Reactions using BAP are carried out in 50mM Tris at 68°C to suppress the activity
of any exonucleases which are present in the enzyme preparations. Reactions are run for 1
hour. Following the reaction the DNA fragment is gel purified.

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two
complementary polydeoxynucleotide strands which may be chemically synthesized. Such
synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated.

"Ligation" refers to the process of forming phosphodiester bonds between two double
stranded nucleic acid fragments (Maniatis, T. et al., Id., p. 146). Unless otherwise provided,
ligation is accomplished using known buffers and conditions with 10 units of T4 DNA ligase
("ligase") per 0.5 µg of approximately equimolar amounts of the DNA fragments to be ligated.
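
The ligase ratio and the equimolar condition lend themselves to a short worked calculation. In the sketch below the function names, fragment sizes and masses are hypothetical; the only relationships relied on are the stated 10 units per 0.5 µg and the fact that, for equal moles of double stranded DNA, mass scales with fragment length.

```python
LIGASE_UNITS_PER_UG = 10 / 0.5  # 10 units of T4 DNA ligase per 0.5 ug of DNA

def ligase_units(total_dna_ug):
    """Units of ligase for a given total mass of DNA fragments."""
    return LIGASE_UNITS_PER_UG * total_dna_ug

def equimolar_insert_ug(vector_ug, vector_bp, insert_bp):
    """Mass of insert containing the same number of moles as the vector."""
    return vector_ug * insert_bp / vector_bp

# Hypothetical example: 0.4 ug of a 4000 bp vector and a 1000 bp insert.
insert_ug = equimolar_insert_ug(0.4, 4000, 1000)
total_ug = 0.4 + insert_ug
print(insert_ug, total_ug, ligase_units(total_ug))
```

In this example the equimolar insert mass is about 0.1 µg, the total is about 0.5 µg, and the stated ratio therefore calls for about 10 units of ligase.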

"Filling" or "blunting" refers to the procedures by which the single stranded end in the 
cohesive terminus of a restriction enzyme-cleaved nucleic acid is converted to a double
strand. This eliminates the cohesive terminus and forms a blunt end. This process is a
versatile tool for converting a restriction cut end that may be cohesive with the ends created
by only one or a few other restriction enzymes into a terminus compatible with any
blunt-cutting restriction endonuclease or other filled cohesive terminus. Typically, blunting is
accomplished by incubating 2-15 µg of the target DNA in 10mM MgCl2, 1mM dithiothreitol,
50mM NaCl, 10mM Tris (pH 7.5) buffer at about 37°C in the presence of 8 units of the
Klenow fragment of DNA polymerase I and 250 µM of each of the four deoxynucleoside
triphosphates. The incubation generally is terminated after 30 min. by phenol and chloroform
extraction and ethanol precipitation.

It is understood that the application of the teachings of the present invention to a 
specific problem or situation will be within the capabilities of one having ordinary skill in the 
art in light of the teachings contained herein. Examples of the products of the present
invention and representative processes for their isolation, use, and manufacture appear
below, but should not be construed to limit the invention.

EXAMPLE 

We have been able to produce large amounts of two different rgp120 fusion proteins
in a mammalian cell system (Lasky et al., 1986). This has allowed us to elucidate all nine of
the disulfide bonds, the positions of the glycosylation sites that are utilized and the type of
oligosaccharide moiety present at each site in rgp120 from the IIIB isolate of HIV-1 produced
in CHO cells.

This example describes the structural characterization of the recombinant envelope
glycoprotein (rgp120) of human immunodeficiency virus type 1 produced by expression in
Chinese hamster ovary cells. Enzymatic cleavage of rgp120 and reversed-phase high
performance liquid chromatography were used to confirm the primary structure of the protein,
to assign intrachain disulfide bonds and to characterize potential sites for N-glycosylation.
All of the tryptic peptides identified were consistent with the primary structure predicted from
the cDNA sequence. Tryptic mapping studies combined with treatment of isolated peptides
with S. aureus V8 protease or with peptide:N-glycosidase F (PNGase F) followed by
endoproteinase Asp-N permitted the assignment of all nine intrachain disulfide bonds of
rgp120. The 24 potential sites for N-glycosylation were characterized by determining the
susceptibilities of the attached carbohydrate structures to PNGase F and to
endo-β-N-acetylglucosaminidase H. Tryptic mapping of enzymatically deglycosylated rgp120
was used in conjunction with Edman degradation and fast atom bombardment-mass
spectrometry of individually treated peptides to determine which of these sites are
glycosylated and what types of structures are present. The results indicate that all 24 sites
of gp120 are utilized, including 13 that contain complex-type oligosaccharides as the
predominant structures, and 11 that contain primarily high mannose-type and/or hybrid-type
oligosaccharide structures.

For convenience, complete bibliographic references are given at the end of this
Example.

EXPERIMENTAL PROCEDURES 

The abbreviations used throughout this example are: AAA, amino acid analysis; AIDS,
acquired immunodeficiency syndrome; amu, atomic mass unit; CHO, Chinese hamster ovary;
DTT, dithiothreitol; endo H, endo-β-N-acetylglucosaminidase H; FAB-MS, fast atom
bombardment-mass spectrometry; gD1, herpes simplex type 1 glycoprotein D; gp,
glycoprotein; HIV, human immunodeficiency virus; HPLC, high performance liquid
chromatography; IAA, iodoacetic acid; PNGase F, peptide:N-glycosidase F; PTH,
phenylthiohydantoin; RCM, reduced and S-carboxymethylated; rgp, recombinant glycoprotein;
SIV, simian immunodeficiency virus; TFA, trifluoroacetic acid; TPCK,
L-1-p-tosylamido-2-phenylethyl chloromethyl ketone.

Materials- Recombinant gp120 proteins were produced in CHO cells and purified by
immunoaffinity chromatography as previously described (Lasky et al., 1986). DTT, IAA, and
2-acetamido-1-β-(L-aspartamido)-1,2-dideoxy-D-glucose (GlcNAc-Asn) were obtained from
Sigma Chemical Company. HPLC/Spectro Grade trifluoroacetic acid (Pierce), Acetonitrile UV
(American B&J), and Milli Q™ water (Millipore) were used for reversed-phase HPLC. The
enzymes used were TPCK trypsin from Worthington Biomedical Corp., endoproteinase Asp-N
("sequencing grade") obtained from Boehringer Mannheim GmbH, S. aureus V8 protease from
ICN ImmunoBiologicals, and PNGase F (N-Glycanase™) and endo H from Genzyme.

Reduction and S-Carboxymethylation- Recombinant gp120 (2.0 mg of CL44 [SEQ. ID NO.
12]) was dialyzed against 0.36 M Tris buffer, pH 8.6, containing 8 M urea and 3 mM EDTA.
DTT was added to a concentration of 10 mM and the sample was incubated for 4 hours at
ambient temperature. The sample was then treated with 25 mM IAA in the dark for 30
minutes at ambient temperature. The reaction was quenched with excess DTT, the sample
was dialyzed against 0.1 M ammonium bicarbonate, and then lyophilized.

Treatment of RCM rgp120 with PNGase F- RCM rgp120 (0.5 mg) was reconstituted in 0.1
ml of 0.25 M sodium phosphate, pH 8.6, containing 10 mM EDTA and 0.02% NaN3 to a
concentration of 5 mg per ml. Tryptic peptides were reconstituted to the same molar
concentration in 0.05 mM sodium phosphate, pH 7.0, containing 0.02% NaN3. PNGase F
was added to the sample in the ratio of 12.5 units per mg of protein and the sample was
incubated overnight at 37°C. RCM rgp120 treated with PNGase F was dialyzed against 0.1
M ammonium bicarbonate.

Treatment of RCM rgp120 with Endo H- RCM rgp120 (0.5 mg) was reconstituted in 0.1 ml
of 0.05 M sodium phosphate, pH 6.0, containing 0.02% NaN3. Endo H (2 units/ml) was
added to the sample in the ratio of 0.1 unit per mg of protein and the sample was incubated
overnight at 37°C. RCM rgp120 treated with endo H was dialyzed against 0.1 M ammonium
bicarbonate.
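
The enzyme amounts implied by the ratios above can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function name `enzyme_dose` and its parameters are illustrative and are not part of the procedures described here.

```python
def enzyme_dose(protein_mg, units_per_mg, stock_units_per_ml=None):
    """Return (units, volume_ml) of enzyme needed for a given mass of protein.

    volume_ml is None when no stock concentration is supplied.
    """
    units = protein_mg * units_per_mg
    volume_ml = units / stock_units_per_ml if stock_units_per_ml else None
    return units, volume_ml


# 0.5 mg of RCM rgp120 reconstituted in 0.1 ml of buffer
concentration_mg_per_ml = 0.5 / 0.1

# PNGase F at 12.5 units per mg of protein (no stock concentration is stated)
pngase_units, _ = enzyme_dose(0.5, 12.5)

# Endo H at 0.1 unit per mg of protein, drawn from a 2 units/ml stock
endo_h_units, endo_h_ml = enzyme_dose(0.5, 0.1, stock_units_per_ml=2.0)

print(concentration_mg_per_ml, pngase_units, endo_h_units, endo_h_ml)
```

Under these ratios a 0.5 mg sample calls for 6.25 units of PNGase F and 0.05 unit of endo H in 0.025 ml, which are the same quantities quoted for the peptide-level treatments in this Example.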

Treatment with TPCK-Trypsin- Samples of untreated, PNGase F-treated and endo H-treated
RCM rgp120 (0.5 mg aliquots of CL44 [SEQ. ID NO. 12]) in 0.1 M ammonium bicarbonate
were treated at ambient temperature with TPCK-trypsin by the addition of aliquots of enzyme
(enzyme to substrate ratio of 1:100 w/w) at 0 and 6 hours of incubation. The digestion was
stopped after 24 hours by freezing the samples. For disulfide determinations, a sample of
rgp120 (0.5 mg of 9AA [SEQ. ID NO. 11]) was treated with TPCK-trypsin using the same
conditions.

Treatment of Tryptic Peptides with PNGase F Followed by Endoproteinase Asp-N- Peptides
(ranging from 0.5 nmol to 3.7 nmol) purified by reversed-phase HPLC of a 9AA tryptic digest
were reconstituted in 0.05 M sodium phosphate, pH 7.0, containing 0.02% NaN3 (0.05 ml).
PNGase F (5 units in 0.06 ml of 0.05 M sodium phosphate, pH 7.0, containing 0.02% NaN3)
was added and the samples were incubated for 20 hours at 37°C. Endoproteinase Asp-N (2
microgram) was then added and the samples were incubated for 20 hours at 37°C.

Treatment of Tryptic Peptides with S. aureus V8 Protease- Peptides (3.0 nmol) purified by
reversed-phase HPLC of a 9AA tryptic digest were reconstituted in 0.05 M sodium
phosphate, pH 7.0, containing 0.02% NaN3 (0.04 ml). V8 protease (5 microgram) was added
at 0 and 7 hours and the sample was incubated for 24 hours at 37°C.

Treatment of CL44 Peptides with Endo H Followed by PNGase F- Peptides (typically 3 nmol)
purified by reversed-phase HPLC were reconstituted in 0.05 M sodium phosphate, pH 6.0,
containing 0.02% NaN3 (0.1 ml). Endo H (0.05 unit in 0.025 ml of 0.05 M sodium
phosphate, pH 6.0, containing 0.02% NaN3) was added and the sample was incubated for
20 hours at 37°C. PNGase F (6.25 units) and 0.5 M sodium phosphate, pH 10.3, containing
0.02 M EDTA and 0.02% NaN3 (0.125 ml) were then added and the sample was incubated
for 20 hours at 37°C.

Reversed-phase HPLC- Tryptic digests were fractionated by reversed-phase HPLC on a 5
micron Vydac C18 endcapped column (4.6 mm x 250 mm). After equilibration with 0.1%
aqueous TFA, the elution of tryptic peptides was carried out at 1 ml per minute with a linear
gradient from 0 to 45% acetonitrile containing 0.08% TFA in 90 minutes. The system used
was a Waters gradient liquid chromatograph consisting of two 6000A pumps, a 720
controller, a WISP 710B injector, and a Perkin-Elmer LC75 single wavelength UV detector
set at 214 nm.
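
For orientation, the nominal solvent composition at any point in such a linear gradient follows from simple interpolation. The sketch below assumes an ideal gradient with no dwell volume or column delay, so it gives the composition programmed at the pumps rather than the composition actually reaching a peptide on the column; the function name `percent_acetonitrile` is illustrative.

```python
def percent_acetonitrile(t_min, start=0.0, end=45.0, duration_min=90.0):
    """Nominal % acetonitrile at time t_min for a linear gradient.

    Values are clamped to the start and end compositions outside the
    gradient window. Dwell volume and column delay are ignored.
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    fraction = min(max(t_min / duration_min, 0.0), 1.0)
    return start + (end - start) * fraction


# Gradient slope for the analytical tryptic map: 45% over 90 minutes
slope_percent_per_min = (45.0 - 0.0) / 90.0

# Nominal composition at a retention time of 32.4 minutes
print(slope_percent_per_min, percent_acetonitrile(32.4))
```

The same function covers a 0 to 60% gradient run over 60 minutes by passing `end=60.0, duration_min=60.0`.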

Peptides subjected to further manipulations were fractionated by reversed-phase HPLC
on a Vydac C18 column (2.1 mm x 250 mm) equilibrated in 0.1% aqueous TFA at a flow rate
of 0.2 ml per minute and a temperature of 40°C. These peptides were eluted with a linear
gradient from 0 to 60% acetonitrile (containing 0.08% TFA) in 60 minutes. The system used
was a Hewlett-Packard 1090M liquid chromatograph.

Peptide identification- Peptides collected from reversed-phase HPLC were identified by AAA
and/or N-terminal sequence analysis. Samples for AAA were treated with constant boiling
HCl at 110°C in vacuo for either 24 or 72 hours, depending upon extent of glycosylation.
The extended hydrolysis degrades glucosamine, which would otherwise interfere with
quantitation of Ile and Leu. Analysis was performed on a Beckman Model 6300 amino acid
analyzer with ninhydrin detection.

N-terminal sequence analysis was performed on an Applied Biosystems Model 
477A/120A. The acetonitrile concentration in the equilibration buffer of the PTH analysis 
system was decreased from 10 to 9% to resolve the PTH derivative of GlcNAc-Asn from 
DTT. 

FAB-MS- FAB mass spectra were acquired on a JEOL HX110HF/HX110HF tandem mass
spectrometer operated in a normal two-sector mode. FAB-MS was performed with 6 keV
xenon atoms (10 mA emission current). Data were acquired over a mass range of 380-4000
amu.

RESULTS

Lasky et al. (1986) expressed gp120 in CHO cells as a fusion protein using the signal
peptide of the herpes simplex gD1. Two such fusion proteins were used in this study. The
recombinant glycoprotein used in most of this study (CL44 [SEQ. ID NO. 12]) was expressed
as a 498-amino-acid fusion protein containing the first 27 residues of gD1 fused to residues
31-501 of gp120 (Lasky et al., 1986). This construction lacks the first cysteine residue of
mature gp120. Disulfide assignments were carried out on another recombinant fusion protein
(9AA [SEQ. ID NO. 11]) which contains the first 9 residues of gD1 fused to residues 4-501
of gp120. This restores the first cysteine residue, Cys-24. Carboxy-terminal analysis of
CL44 [SEQ. ID NO. 12] using carboxypeptidase digestions indicated that glutamic acid residue
479 is the carboxy terminus of the fully processed molecule secreted by CHO cells (data not
shown). The amino acid sequences of these two constructions are given in Figure 1.

RCM CL44 Tryptic Map- Reversed-phase HPLC tryptic mapping was used to confirm the
primary structure of the molecule, to assign intrachain disulfide bonds and to characterize
potential sites for N-glycosylation. In experiments not intended to give information about
disulfides, the protein was RCM prior to digestion with trypsin. This treatment unfolds the
protein and disrupts disulfide bonds, thereby resulting in smaller tryptic fragments than would
be obtained with the native molecule.
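
The tryptic peptides predicted from a cDNA-derived sequence can be generated with a simple in-silico digest. The sketch below is illustrative only: it assumes the idealized rule that trypsin cleaves on the carboxy side of Lys and Arg except before Pro (a common convention, not a rule stated in this Example), and it ignores the missed and chymotrypsin-like cleavages that occur in a real digest. The two test sequences are peptide sequences as read from Table II of this Example.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after each K or R that is not followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep any C-terminal remainder that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


# Partial-cleavage product (GEIK)NCSFNISTSIR splits into two peptides
print(tryptic_digest("GEIKNCSFNISTSIR"))

# Peptide T16 contains an internal Arg-Pro bond, which is left intact
print(tryptic_digest("TIIVQLNQSVEINCTRPNNNTR"))
```

Comparing such a predicted list with the peptides actually recovered is one way to flag the unexpected species, such as uncleaved pairs or additional non-tryptic fragments, that are discussed for the RCM CL44 map.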

The reversed-phase HPLC tryptic map of RCM CL44 is shown in Figure 2. Tryptic 
peptides were separated by reversed-phase HPLC using an acetonitrile/water system with 
TFA as the ionic modifier. As will be discussed below, much of the peak heterogeneity 
derives from the extremely high (approximately 50% of total mass) carbohydrate content of 
the molecule. Peaks were collected and subjected to AAA for identification (Table I). In 
some cases, N-terminal sequence analysis was used for confirmation (these peaks are
indicated in Table I). The peaks not assigned a label in Figure 2 were not identified.

All of the peptides identified were consistent with the primary structure predicted from
the cDNA sequence. Of the 38 predicted peptides with 3 or more amino acids, 36 were
identified in the tryptic map of RCM CL44. In addition, 4 predicted peptides consisting of 2
amino acids each were also identified (H3, H4, T23, and T35). The tripeptide composed of
residues 139-141 (VQK) was not identified in the map and was not given a label in Figure 2.
The only other peptide not identified was T13 (CNNK). Asparagine residue 200 of peptide
T13 is a potential glycosylation site and the peptide lacks hydrophobic amino acids.
Therefore, this glycopeptide is likely to be extremely hydrophilic and poorly resolved from the
salt fraction on the reversed-phase column.

Tryptic cleavage did not occur between peptides T5 and T6 and between peptides T8
and T9. These are designated in Figure 2 as two T-numbers separated by a comma (T5,6
and T8,9). The absence of cleavage was confirmed by N-terminal sequence analysis of the
peptides. In both of these cases, the asparagine residue to the C-terminal side of the
cleavage site is a potential N-glycosylation site and it is likely that the carbohydrate moiety
interferes with the action of trypsin. Incomplete tryptic cleavage was also observed between
peptides H4 and T2' and between peptides T23 and T24 (H4,T2' and T23,24).

Several peptides arising from non-tryptic cleavages were observed in the tryptic map 
of RCM CL44. Two of the predicted tryptic peptides were further cleaved by 
"chymotrypsin-like" cleavages. Peptide T12 was completely cleaved after tyrosine residue 
187 and phenylalanine residue 193 to yield peptides T12a, T12b, and T12c. Peptide T4 was
partially hydrolyzed after leucine residue 95 to yield peptides T4a and T4b. Intact peptide T4 
was also present. 

One of the tryptic peptides, T22 (QAHCNISR) [SEQ. ID NO. 14] eluted at two different 
positions (32.4 minutes and 34.1 minutes) in the RCM CL44 tryptic map. Deglycosylation 
studies (discussed below) with PNGase F and endo H indicated that the different retention
times of the two forms of peptide T22 are not due to carbohydrate differences. It is possible
that this retention time heterogeneity results from partial conversion of the N-terminal
glutamine residue to pyroglutamic acid (Sanger and Thompson, 1953).

Disulfide Assignments in gp120- Mature gp120 contains 18 cysteine residues (shaded in
Figure 1) and therefore could contain 9 intrachain disulfide bonds. The CL44 [SEQ. ID NO.
12] construction lacks Cys-24, the first cysteine residue of gp120 (Lasky et al., 1986);
therefore, a different construction (9AA [SEQ. ID NO. 11]), in which the first cysteine residue
was restored, was expressed and purified to approximately the same degree as CL44 (L.
Riddle, T. Gregory and D. Dowbenko, unpublished data). Ellman's reagent (Ellman, 1959)
was used to demonstrate the absence of free sulfhydryl groups in 9AA [SEQ. ID NO. 11]
(data not shown). Therefore, disulfide assignments were determined for the 9AA
construction.

Tryptic mapping studies performed without S-carboxymethylation of cysteine residues
allowed partial assignment of disulfides. The tryptic map of 9AA is shown in Figure 3. Peaks
were identified by N-terminal sequence analysis (Table II). These identifications allowed
unequivocal assignment of three of the nine disulfide bonds: between Cys-101 and Cys-127
(Peak A, Table II); between Cys-266 and Cys-301 (Peak B, Table II); and between Cys-24 and
Cys-44 (Peak E, Table II).

Peptides containing the remaining cysteine residues were also identified (Table II).
Peptide T28 contains three cysteine residues and coelutes with peptide T31, which contains
one cysteine residue (Peak D, Table II). Peptide T11 contains two cysteine residues and
coelutes with peptides T3 and T4, each of which contains a single cysteine residue (Peak F,
Table II). Similarly, peptide T14 contains two cysteine residues and coelutes with peptides
T12 and T13, each of which has a single cysteine residue (Peaks C and E, Table II). In each
of these cases more than one disulfide bond was present in the group of tryptic peptides,
thereby preventing unambiguous assignment. These tryptic peptides were further
manipulated as described below to introduce selective cleavage between cysteine residues
located on a single peptide.

Table II. Identification of Cysteine-containing Peptides from the
Tryptic Map of 9AA.

Cys-containing peaks from the tryptic map of 9AA were identified by
N-terminal sequence analysis. Cysteines are shown in brackets and are
labelled by amino acid number. For each peak, the disulfide bond
assigned from it is listed, or the peak is marked as containing
disulfide bonds that could not be assigned unambiguously in this
experiment. Partial cleavages are indicated by parentheses. Peptides
are labelled with T-numbers corresponding to the nomenclature used in
Figure 1.

Peak  Peptide  Cys        Sequence

A     T5,6     101        [C]TDLKNDTNTNSSSGR
      T8,9     127        (GEIK)N[C]SFNISTSIR
      Assigned: Cys-101 to Cys-127

B     T16      266        TIIVQLNQSVEIN[C]TRPNNNTR
      T22      301        QAH[C]NISR
      Assigned: Cys-266 to Cys-301

C     T12a,b   188        VSFEPIPIHY[C]APAGF
      T13      198        [C]NNK
      T14a     209, 217   TFNGTGP[C]TNVSTVQ[C]THGIR
      Not assigned unambiguously

D     T28      348, 355,  QSSGGDPEIVTHSFN[C]GGEFFY[C]NSTQLFNSTWFNSTWSTE-
               388        -GSNNTEGSDTITLP[C]R
      T31      415        [C]SSNITGLLLTR
      Not assigned unambiguously

E     T1       24         EATTTLF[C]ASDAK
      T2       44         AYDTEVHNVWATHA[C]VPTDPNPQEVVLVNVTENFNMWK
      Assigned: Cys-24 to Cys-44
      T12      188        VSFEPIPIHY[C]APAGFAILK
      T13      198        [C]NNK
      T14      209, 217   TFNGTGP[C]TNVSTVQ[C]THGIRPVVSTQLLLNGSLAEEEVVIR
      Not assigned unambiguously

F     T3       89         NDMVEQMHEDIISLWDQSLKP[C]VK
      T4       96         LTPL[C]VSLK
      Not assigned unambiguously

Each of the peptides has a potential N-linked glycosylation site located between the
cysteine residues. The peptides were treated with PNGase F, which removes
asparagine-linked carbohydrate while converting the attachment asparagine residue to
aspartic acid (Tarentino et al., 1985). The resulting aspartic acid residue serves as a point
for selective cleavage of the peptides with endoproteinase Asp-N (Drapeau, 1980). The
peptides were separated by reversed-phase HPLC and identified by N-terminal sequence
analysis.
The HPLC chromatogram obtained after treatment of peptides T12, T13, and T14
(Peak C, Figure 3) with PNGase F followed by endoproteinase Asp-N is given in Figure 4a, and
the sequences of relevant peptides are given in Table III. The results indicate that rgp120 has
disulfide bonds between Cys-198 and Cys-209 and between Cys-188 and Cys-217 (Table
III). Treatment of peptides T3, T4, and T11 (Peak F, Figure 3) with PNGase F followed by
endoproteinase Asp-N allowed the recovery of fragments that demonstrated the presence of
disulfide bonds between Cys-89 and Cys-175 and between Cys-96 and Cys-166 (Figure 4b
and Table III).

Table III. Assignment of Disulfides from Peptides isolated in Figure 4.

The tryptic peptides that could not be assigned unambiguously in
Table II were further manipulated as described in Figure 4. Peaks
were identified by N-terminal sequence analysis. Each row lists the
two disulfide-linked fragments; cysteines are shown in brackets.

Cys   Sequence                    Cys   Sequence

198   (fragment of T13)           209   DGTGP[C]T
188   EPIPIHY[C]APAGF             217   DVSTVQ[C]THG(IR)
89    DQSLKP[C]VK                 175   DTSVITQA[C]PK
96    LTPL[C]VSLK                 166   DDTTSYTLTS[C]
348   IVTHSFN[C]GGE               415   [C]SSNITGLLLTR
355   FFY[C]NSTQLFNSTWFNSTWSTE    388   TITLP[C]R

The last two disulfide bonds were assigned by treating peptides T28 and T31 (Peak
D, Figure 3) with V8 protease to cleave to the carboxy side of the glutamic acid and aspartic
acid residues (Drapeau et al., 1972) located between the cysteine residues of T28. The
chromatogram obtained after V8 protease digestion of T28 and T31 is given in Figure 4c and
the sequences of the relevant peptides are given in Table III. The results demonstrated the
presence of disulfide bonds between Cys-348 and Cys-415 and between Cys-355 and
Cys-388.

Thus, the combined results of the tryptic mapping analysis and the further selective
degradations permitted the assignment of all nine intrachain disulfide bonds of rgp120.
Parallel experiments performed on CL44 [SEQ. ID NO. 12] produced similar results for the 8
disulfide bonds remaining in that construction (not shown). The disulfide bond assignments
of rgp120 are summarized in Figure 6.
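
The nine assignments reported in this section can be collected into a small data structure and checked for internal consistency, that is, that every one of the 18 cysteines is used exactly once. The sketch below is an illustration only; the names `DISULFIDES_9AA`, `is_complete_pairing` and `bonds_without` are hypothetical.

```python
# Disulfide bonds of rgp120 (9AA construction) as assigned in the text,
# written as (Cys residue number, Cys residue number)
DISULFIDES_9AA = [
    (24, 44),
    (89, 175),
    (96, 166),
    (101, 127),
    (188, 217),
    (198, 209),
    (266, 301),
    (348, 415),
    (355, 388),
]


def is_complete_pairing(bonds, n_cysteines):
    """True if every cysteine appears in exactly one bond and none is left over."""
    cysteines = [cys for bond in bonds for cys in bond]
    return len(cysteines) == n_cysteines and len(set(cysteines)) == n_cysteines


def bonds_without(bonds, missing_cys):
    """Bonds that can still form when one cysteine is absent from the construct."""
    return [bond for bond in bonds if missing_cys not in bond]


print(is_complete_pairing(DISULFIDES_9AA, 18))
# CL44 lacks Cys-24, so the Cys-24 to Cys-44 bond cannot form
print(len(bonds_without(DISULFIDES_9AA, 24)))
```

A complete pairing is consistent with the absence of free sulfhydryl groups in 9AA, and removing Cys-24 leaves the 8 bonds reported for CL44.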

Glycosylation Sites of gp120- Mature gp120 contains 24 potential sites for N-glycosylation,
as recognized by the sequence: Asn-Xaa-Ser(Thr) (Kornfeld and Kornfeld, 1985). These sites
are indicated by a dot above the corresponding asparagine residue in Figure 1A. In the
present study, tryptic mapping of enzymatically deglycosylated CL44 [SEQ. ID NO. 12] was
used in conjunction with Edman degradation and FAB-MS of individually treated peptides to
determine which of the 24 potential N-glycosylation sites are glycosylated and which contain
less fully processed (i.e. high mannose-type or hybrid-type) oligosaccharides.
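
Potential sites of this kind can be located in a sequence with a short pattern search. The sketch below is illustrative: it applies the Asn-Xaa-Ser(Thr) rule and additionally excludes Pro at the Xaa position, which is a widely used refinement and an assumption here rather than something stated in this Example. The peptide is T16 as read from Table II, and the starting residue number of 253 is inferred from the residue numbers given for that peptide.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr. The lookahead
# keeps the match to the Asn itself so that overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide, first_residue=1):
    """Return residue numbers of Asn residues in Asn-Xaa-Ser/Thr sequons."""
    return [first_residue + match.start() for match in SEQUON.finditer(peptide)]


T16 = "TIIVQLNQSVEINCTRPNNNTR"

# Positions within the peptide (equivalent to Edman cycle numbers)
print(sequon_positions(T16))

# Positions in gp120 numbering, assuming T16 begins at residue 253
print(sequon_positions(T16, first_residue=253))
```

For T16 the search returns three sites, in agreement with the three potential sites (Asn-259, Asn-265 and Asn-271) listed for that peptide.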

The two enzymes used for deglycosylation were PNGase F and endo H. PNGase F
releases all types of N-linked oligosaccharide structures by cleavage of the
β-aspartylglucosylamine linkage (Tarentino et al., 1985). Endo H releases only high
mannose-type and hybrid-type oligosaccharide structures by cleaving between the two core
N-acetylglucosamine residues (Tai et al., 1977). Deglycosylation of a peptide can be
monitored by the increase in retention time of the peak corresponding to the glycopeptide in
the reversed-phase elution profile. Thus, it was possible to determine which peptides were
glycosylated by treatment with PNGase F and, on the basis of susceptibility to endo H, to
distinguish those with attached high mannose-type and/or hybrid-type oligosaccharides as the
predominant structures.
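
The decision logic described in this paragraph can be written as a small function. This is a sketch under a simplifying assumption: each enzyme either shifts the peptide to a later retention time or does not, with no partial susceptibility. The function name and the returned labels are illustrative.

```python
def classify_glycopeptide(shifted_by_pngase_f, shifted_by_endo_h):
    """Classify the predominant oligosaccharide type on a glycopeptide.

    Each argument is True if treatment with that enzyme increased the
    retention time of the peptide in the reversed-phase tryptic map.
    """
    if not shifted_by_pngase_f:
        # PNGase F releases all N-linked structures, so no shift means
        # no evidence of N-glycosylation
        return "not glycosylated"
    if shifted_by_endo_h:
        # Endo H releases only high mannose-type and hybrid-type structures
        return "high mannose and/or hybrid"
    return "complex"


print(classify_glycopeptide(True, False))
print(classify_glycopeptide(True, True))
```

For a peptide with a single glycosylation site the result applies to that site directly. For a peptide with several sites, susceptibility to endo H shows only that at least one site carries a high mannose-type or hybrid-type structure, so the individual sites must be resolved by another method.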

The 24 potential glycosylation sites of CL44 [SEQ. ID NO. 12] are contained in 14
tryptic glycopeptides. Thirteen of these glycopeptides were identified in the tryptic map of
RCM CL44 (Figure 2). As mentioned above, T13 (CNNK) [SEQ. ID NO. 15] was not
identified. The tryptic maps of PNGase F-treated RCM CL44 and endo H-treated RCM CL44
are compared with the RCM CL44 tryptic map in Figure 5. The peaks corresponding to
glycopeptides are labelled in each of the three tryptic maps.

As would be expected for a heavily glycosylated molecule, treatment of RCM CL44
with PNGase F (Figure 5b) simplified the tryptic map significantly. Typically, the peaks
corresponding to potential glycopeptides in the RCM CL44 tryptic map (Figure 5a) were broad
and often appeared as multiplets. Deglycosylation resulted in sharp, single peaks for each
peptide, indicating that the glycopeptide peak multiplicity and broadness was due to
carbohydrate heterogeneity.

All of the 13 potential glycopeptides that had been identified in the tryptic map of RCM
CL44 were shifted to later retention times in the tryptic map of PNGase F-treated material.
This demonstrates that at least 13 of the 24 potential sites are glycosylated. Peptide T28
was not recovered after deglycosylation. This peptide contains a large number of non-polar
amino acids and, after removal of the hydrophilic carbohydrate moieties, may bind irreversibly
to the HPLC column. As described above, peptide T22 elutes at 2 positions in the RCM CL44
tryptic map presumably as a result of conversion of the N-terminal glutamine to pyroglutamic
acid. The retention times of both of the T22 peaks were altered in the deglycosylated
material produced by treatment with both PNGase F and endo H, confirming that the
difference between these forms of peptide T22 in the RCM CL44 tryptic map was not due
to carbohydrate heterogeneity.

The tryptic map of endo H-treated RCM CL44 (Figure 5c) indicated that 6 of the 13
tryptic glycopeptides were endo H-susceptible (peptides T14, T16, T22, T24, T28, and T31).
In addition, a small amount of peptide T15 showed endo H susceptibility. For each of these
glycopeptides, the elution time of the endo H-treated glycopeptide was earlier than that of
the corresponding PNGase F-treated glycopeptide. This is due to the hydrophilic
N-acetylglucosamine residue that remains attached to the asparagine residue following endo
H treatment. Peptide T16 was not identified in the tryptic map of endo H-treated RCM CL44.
This peptide contains 3 potential glycosylation sites and was poorly recovered under any
circumstances.

Conclusions as to the type of glycosylation present on each of the tryptic
glycopeptides based on susceptibility to PNGase F and endo H are summarized in Table IV.
Seven of the 13 glycopeptides identified in the tryptic map of RCM CL44 contain only a
single glycosylation site and thus could be characterized unambiguously with regard to
enzyme susceptibility. Peptides T2' (Asn-58), T26 (Asn-326), and T32 (Asn-433) were
deglycosylated only by PNGase F and, therefore, contain attached complex-type
oligosaccharide structures. Peptides T22 (Asn-302), T24 (Asn-309), and T31 (Asn-418)
were susceptible to both PNGase F and endo H and, therefore, carry high mannose-type
and/or hybrid-type oligosaccharide structures. Peptide T15 is only partially susceptible to
endo H; therefore, Asn-246 carries primarily complex-type oligosaccharides but must also
have some attached high mannose-type and/or hybrid-type oligosaccharide structures.

Table IV. Assignment of Glycosylation Type to RCM CL44 Tryptic Peptides by Susceptibility to
PNGase F and Endo H.

Susceptibility to PNGase F or endo H was determined by an increase in the retention time of a
peptide in the tryptic map of RCM CL44. PNGase F releases all types of N-linked
oligosaccharide structures, whereas endo H releases only high mannose and hybrid
oligosaccharide structures.

Tryptic     Glycosylation Sites   Susceptible   Susceptible   Glycosylation
Peptide(a)  (Asn Residue #)       To PNGase F   To Endo H     Type

T2'         58                    Yes           No            Complex
T6          106, 111              Yes           No            Complex(b)
T9          126, 130              Yes           No            Complex(b)
T11         156, 167              Yes           No            Complex(b)
T14         204, 211, 232         Yes           Yes           High Mannose, Hybrid, and/or Complex(c)
T15         246                   Yes           Trace         Complex (Trace High Mannose and/or Hybrid)
T16         259, 265, 271         Yes           Yes           High Mannose, Hybrid, and/or Complex(c)
T22         302                   Yes           Yes           High Mannose and/or Hybrid
T24         309                   Yes           Yes           High Mannose and/or Hybrid
T26         326                   Yes           No            Complex
T28         356, 362, 367, 376    Yes           Yes           High Mannose, Hybrid, and/or Complex(c)
T31         418                   Yes           Yes           High Mannose and/or Hybrid
T32         433                   Yes           No            Complex

(a) T13 not found.
(b) Either or both sites glycosylated.
(c) Endo H susceptible glycosylation at one or more site(s).

Peptides T6, T9, and T11 each contain 2 potential glycosylation sites. Each peptide
was deglycosylated by PNGase F but not by endo H, indicating the presence of mostly
complex-type oligosaccharide structures. In order to determine whether one or both of the
potential glycosylation sites in each peptide were actually glycosylated, the PNGase F-treated
glycopeptides were subjected to either FAB-MS or Edman degradation. Treatment with
PNGase F converts the attachment asparagine residue to aspartic acid during deglycosylation
(Tarentino et al., 1985). This conversion can be detected by FAB-MS as an increase of 1
amu in the mass of the peptide for each site deglycosylated (Carr and Roberts, 1986) or by
Edman degradation by the appearance of the PTH derivative of aspartic acid at the
appropriate cycles. FAB-MS of deglycosylated peptide T5,6 revealed an ion corresponding
to the peptide mass plus 2 amu ([MH]+ observed: m/z 1772.6; calculated: m/z 1772.7).
FAB-MS of deglycosylated peptide T9 gave similar results ([MH]+ observed: m/z 1301.8;
calculated: m/z 1301.5). Edman degradation was performed instead of FAB-MS on
deglycosylated peptide T11 because of its high molecular weight (>2000 amu). Aspartic
acid was observed in cycles 8 (derived from Asn-156) and 19 (derived from Asn-167). These
combined results indicate the presence of complex-type oligosaccharide structures attached
to Asn residues 106, 111, 126, 130, 156, and 167.
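
The mass shift used in this argument can be reproduced with a short calculation. The sketch below is illustrative and rests on stated assumptions: standard monoisotopic residue masses, a singly protonated ion, and cysteine present as the S-carboxymethyl derivative (the peptides come from RCM material). The T9 sequence is as read from Table II. The exact method used to obtain the calculated values in the text is not stated, so agreement is expected only to within a few tenths of a mass unit.

```python
# Monoisotopic residue masses (Da) for the 20 common amino acids
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
CARBOXYMETHYL = 58.00548  # C2H2O2 added to each Cys by iodoacetic acid


def mh_plus(sequence, carboxymethyl_cys=True):
    """Monoisotopic m/z of the singly protonated peptide, [MH]+."""
    mass = sum(MONO[residue] for residue in sequence) + WATER + PROTON
    if carboxymethyl_cys:
        mass += CARBOXYMETHYL * sequence.count("C")
    return mass


def deglycosylate(sequence, asn_positions):
    """Convert Asn to Asp at the given 1-based positions (PNGase F product)."""
    residues = list(sequence)
    for position in asn_positions:
        if residues[position - 1] != "N":
            raise ValueError(f"position {position} is not Asn")
        residues[position - 1] = "D"
    return "".join(residues)


T9 = "NCSFNISTSIR"                        # Asn-126 and Asn-130 are positions 1 and 5
T9_deglycosylated = deglycosylate(T9, [1, 5])

shift = mh_plus(T9_deglycosylated) - mh_plus(T9)
print(round(mh_plus(T9_deglycosylated), 1), round(shift, 2))
```

Each Asn to Asp conversion adds just under 1 mass unit, so a peptide in which both sites were glycosylated appears close to 2 mass units above the unmodified peptide mass, whereas a peptide with only one occupied site would appear about 1 mass unit above it.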

The remaining 3 glycopeptides identified in the tryptic map of RCM CL44 contained
multiple potential glycosylation sites and were endo H-susceptible. Peptides T14, T16, and
T28 account for a total of 10 potential glycosylation sites. Characterization of each
glycosylation site was achieved by Edman degradation of HPLC-purified peptides that had
been subjected to treatment with endo H followed by PNGase F.

When endo H releases the high mannose-type and hybrid-type oligosaccharide
structures, it leaves an N-acetylglucosamine residue attached to the asparagine residue of the
peptide (Tarentino et al., 1974). PNGase F will not remove this N-acetylglucosamine residue,
but will release the remaining N-linked oligosaccharide structures by cleavage of the
β-aspartylglucosylamine bond, resulting in conversion of the attachment asparagine residue
to aspartic acid (Chu, 1986). Therefore, treatment with endo H followed by PNGase F will
yield asparagine at an unglycosylated site, GlcNAc-Asn at a glycosylation site that contained
primarily high mannose-type and/or hybrid-type oligosaccharide structures, and aspartic acid
at a glycosylation site that carried primarily complex-type oligosaccharide structures. Paxton
et al. (1987) have shown that it is possible to detect the PTH derivative of GlcNAc-Asn after
Edman degradation. Using this approach, it was possible to characterize the remainder of the
glycosylation sites of CL44 [SEQ. ID NO. 12]. For example, treatment of glycopeptide T16,
which contains 3 potential N-glycosylation sites, with endo H followed by PNGase F resulted
in the appearance of the PTH derivative of GlcNAc-Asn at cycles 7 and 13 and the
appearance of PTH-Asp at cycle 19 during Edman degradation. Thus, glycopeptide T16
carries primarily high mannose-type and/or hybrid-type oligosaccharides at Asn-259 and
Asn-265 and complex-type oligosaccharides at Asn-271. The results of these experiments
are summarized in Table V and indicate that CL44 [SEQ. ID NO. 12] contains complex-type
oligosaccharide structures at Asn residues 271, 367, and 376 and high mannose-type and/or
hybrid-type oligosaccharide structures at Asn residues 204, 211, 232, 259, 265, 356, and 
5 362. 
Table V. Assignment of Glycosylation Type to RCM CL44 Tryptic
Glycopeptides Containing Multiple Potential Glycosylation
Sites.

Characterization of multiple potential glycosylation sites on RCM
CL44 tryptic glycopeptides was achieved by Edman degradation of
HPLC-purified peptides subjected to treatment with endo H followed
by PNGase F. Edman degradation of deglycosylated peptides shows
either an Asn residue at an unglycosylated site, a GlcNAc-Asn at a
glycosylation site to which had been attached high mannose or hybrid
oligosaccharide structures, or an Asp residue at a glycosylation site
which had carried complex-type oligosaccharide structures.

Tryptic    Asn          Residue
Peptide    Residue #    Observed      Glycosylation Type

T14        204          GlcNAc-Asn    High Mannose and/or Hybrid
           211          GlcNAc-Asn    High Mannose and/or Hybrid
           232          GlcNAc-Asn    High Mannose and/or Hybrid
T16        259          GlcNAc-Asn    High Mannose and/or Hybrid
           265          GlcNAc-Asn    High Mannose and/or Hybrid
           271          Asp           Complex
T28        356          GlcNAc-Asn    High Mannose and/or Hybrid
           362          GlcNAc-Asn    High Mannose and/or Hybrid
           367          Asp           Complex
           376          Asp           Complex

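
The readout logic behind Table V can be sketched in a few lines of Python. This is an illustrative sketch and not part of the original analysis: the function names are hypothetical, and the start residue of glycopeptide T16 (253) is an assumption inferred from the reported cycles (cycle 7 corresponding to Asn-259), not a value stated in the text.

```python
# Map the residue observed by Edman degradation (after endo H, then PNGase F)
# to the glycosylation type at that site, following the legend of Table V.
READOUT_TO_TYPE = {
    "Asn": "Unglycosylated",
    "GlcNAc-Asn": "High Mannose and/or Hybrid",
    "Asp": "Complex",
}


def residue_number(peptide_start: int, cycle: int) -> int:
    """Residue number reached at a given Edman cycle (cycle 1 = first residue)."""
    return peptide_start + cycle - 1


def classify(peptide_start: int, readouts: dict) -> dict:
    """Return {residue number: glycosylation type} for the observed cycles."""
    return {
        residue_number(peptide_start, cycle): READOUT_TO_TYPE[observed]
        for cycle, observed in readouts.items()
    }


# Glycopeptide T16: GlcNAc-Asn at cycles 7 and 13, PTH-Asp at cycle 19.
# ASSUMPTION: T16 starts at residue 253 (inferred, not stated in the text).
t16 = classify(253, {7: "GlcNAc-Asn", 13: "GlcNAc-Asn", 19: "Asp"})
for site, kind in sorted(t16.items()):
    print(f"Asn-{site}: {kind}")
```

Under that assumption the three cycles map to Asn-259, Asn-265 and Asn-271, matching the T16 rows of Table V.
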
Peptide T13, which contains the remaining glycosylation site, was not identified in any
of the tryptic maps presented in this paper. However, FAB-MS data obtained from the void
peak of a tryptic map of RCM CL44 treated with endo H followed by PNGase F revealed an
ion corresponding to MH+ for that peptide containing an attached N-acetylglucosamine
residue (observed: m/z 740.1; calculated: m/z 740.4). The presence of peptide T13 in the
void peak was further confirmed by AAA. Therefore, we conclude that Asn-200 is
glycosylated and carries primarily high mannose-type and/or hybrid-type oligosaccharide
structures.

The data presented here demonstrate that all 24 potential glycosylation sites of gp120
are utilized, that 13 sites contain primarily complex-type oligosaccharide structures while 11
sites contain primarily high mannose-type and/or hybrid-type oligosaccharide structures. The
type of glycosylation at each site is summarized in Figure 6.
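
As a hedged illustration of how potential N-glycosylation sites can be located in a sequence, the sketch below scans for the conventional Asn-X-Ser/Thr motif (X being any residue except Pro). The function name is hypothetical, and the 22-residue fragment is an assumption: it is taken here to be residues 253-274 of SEQ ID NO. 10, written in one-letter code.

```python
def find_sequons(seq: str, first_residue: int = 1) -> list:
    """Return residue numbers of Asn in Asn-X-Ser/Thr motifs (X != Pro)."""
    sites = []
    for i in range(len(seq) - 2):
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST":
            sites.append(first_residue + i)
    return sites


# ASSUMPTION: residues 253-274 of SEQ ID NO. 10 in one-letter code.
fragment = "TIIVQLNQSVEINCTRPNNNTR"
sites = find_sequons(fragment, first_residue=253)
print(sites)
```

For this fragment the scan reports Asn-259, Asn-265 and Asn-271, the three sites assigned to glycopeptide T16; Asn-270 and Asn-272 are skipped because they are not followed by Ser or Thr at the +2 position.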

DISCUSSION

We have determined the disulfide bonding pattern and the attachment positions of
oligosaccharide moieties of rgp120 from the IIIB isolate of HIV-1. A schematic representation
of this information is presented in Figure 6 [SEQ. ID NO. 10]. The rgp120 molecules from
which the structural data were obtained possess the functional properties attributed to gp120
produced by HIV-1 virions, including high-affinity CD4 binding (Lasky et al., 1987) and HIV-1
neutralizing antigenicity (Lasky et al., 1986). We therefore conclude that the CHO-expressed
gp120 is properly folded and that the disulfide-bonded domains reported here for the
recombinant molecules are representative of those occurring in gp120 produced by HIV-1
virions.

Functional Aspects of gp120 Structure- The gp120 molecule comprises five disulfide-bonded
loop structures. The first and fourth are simple loops formed by single disulfide bonds while
the second, third and fifth are more complex arrays of loops formed by nested disulfide
bonds. The fourth disulfide-bonded domain (residues 266-301) has been shown to contain
significant type-specific neutralizing epitopes (Matsushita et al., 1988; Rusche et al., 1988;
Goudsmit et al., 1988; Javaherian et al., 1989) and the fifth disulfide-bonded domain
(residues 348-415) has been shown to be important for CD4 binding (Lasky et al., 1987;
Kowalski et al., 1987). No direct functional correlates have been described for the other
three disulfide-bonded domains. The amino acid sequence of gp120 varies to a large extent
between different viral isolates but the majority of the variability is localized in hypervariable
regions which punctuate the otherwise relatively conserved sequences (Willey et al., 1986;
Modrow et al., 1987). Modrow et al. (1987) have identified five hypervariable regions which
are characterized by sequence variation, insertions and deletions. Four of these hypervariable
regions correspond to well delineated loops as indicated in Figure 6. With the exception of
the third hypervariable loop (disulfide-bonded domain IV) the functional significance of these
regions is unknown.

The positions of the cysteine residues and, presumably, the disulfide bonding pattern
in gp120 are highly conserved between isolates. Among HIV-1 isolates, the only exception
to this conservation is the Z3 isolate (Willey et al., 1986) which has an additional pair of
cysteine residues in the fourth hypervariable domain (residues 363-384). These residues
most likely form a tenth disulfide bond in the gp120 from this isolate. The presence of this
extra bond in such a hypervariable region probably has no more effect on the structure and
function of the molecule than the other sequence variations that occur in that region.

As shown in Fig. 7, in HIV-2 [SEQ. ID NO. 13], and similarly in SIV (data not shown),
the positions of the cysteine residues in disulfide-bonded domains I, II, IV and V are
conserved (Human Retroviruses and AIDS (1989). G. Myers, A. Rabson, S. Josephs, T.
Smith, J. Berzofsky and F. Wong-Staal, Editors. U.S. Government Printing Office, Los Alamos
National Laboratory, Los Alamos, New Mexico, LA-UR, 89-743). In domain III there are two
additional pairs of cysteine residues (three in SIV isolate MM142) which are presumed to be
disulfide bonded within a finger-like domain III structure analogous to that illustrated in Figure
6. Another major difference between HIV-1, HIV-2 and SIV is that hypervariable region V2
is reduced to five amino acids in HIV-2 and SIV. The functional significance of the
differences between HIV-1, HIV-2 and SIV is unknown at this time.

One of the most important functions of gp120 is its ability to bind to CD4 and thereby
mediate the attachment of virions to susceptible cells (Klatzman et al., 1984; Dalgleish et al.,
1984). The CD4-binding function has been localized by mutagenesis and structural studies
(Lasky et al., 1987; Kowalski et al., 1987) to the region between residues 320 and 450,
which includes the fifth disulfide-bonded domain. Lasky et al. (1987) showed that deletion
of residues 396 to 407 and mutagenesis of Ala-402 to Asp abolished CD4 binding. They also
mapped the epitope of a monoclonal antibody that blocks gp120-CD4 binding to residues
392-402. Kowalski et al. (1987) identified three regions as being involved with CD4 binding.
Insertions between residues 333-334, 388-390 and 442-443 abolished CD4 binding. In
addition, a deletion of residues 441-479 abolished CD4 binding while deletion of residues
362-369 within the fourth hypervariable region had no effect on binding. Cordonnier et al.
(1989) have shown that mutagenesis of Trp-397 to Tyr or Phe decreases CD4 binding and
changes to Ser, Gly, Val or Arg abolish binding. Nygren et al. (1988) have reported that a
proteolytic fragment of gp120 from residue 322 to near the C-terminus retains the ability to
bind to CD4. The results of these studies indicate that the CD4 binding capacity of gp120
is localized to the region between residues 320 and 450 and more specifically to the residues
around 333-334, 442-443 and the sequence between 388 and 407.

In the course of efforts to map the epitope of monoclonal antibody 5C2-E5 which
blocks gp120-CD4 binding, Lasky et al. (1987) treated rgp120 (CL44 [SEQ. ID NO. 12]) with
acetic acid to cleave the protein at aspartic acid residues (Ingram, 1963) and isolated the
peptide fragment 383-426 from a column of immobilized anti-gp120 monoclonal antibody
5C2-E5. Digestion of reduced rgp120 yielded the same fragment. Consequently, it was
concluded that a disulfide bond existed between Cys residues 388 and 415. In the analysis
reported here we have failed to find this disulfide bond and, instead, have consistently found
the disulfide bonds between Cys-355 and Cys-388, and between Cys-348 and Cys-415 as
summarized in Figure 6. We believe that the true disulfide-bond assignment is as indicated
in Figure 6 and that the acetic acid digestion produced some disulfide bond rearrangement
(Ryle and Sanger, 1955) in the earlier work.
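
The two bonds assigned here to the fifth domain give a nested arrangement, which can be checked with a short sketch. The helper name is hypothetical and the code only restates the residue numbers given above; it is an illustration, not a structural analysis.

```python
def is_nested(outer: tuple, inner: tuple) -> bool:
    """True if the inner disulfide lies entirely within the loop closed by the outer one."""
    return outer[0] < inner[0] and inner[1] < outer[1]


# Disulfide bonds found in this analysis (Cys residue numbers).
outer_bond = (348, 415)
inner_bond = (355, 388)
nested = is_nested(outer_bond, inner_bond)
print(nested)
```

The Cys-355/Cys-388 loop sits inside the loop closed by Cys-348/Cys-415, consistent with the description of the fifth domain as an array of loops formed by nested disulfide bonds.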

The Oligosaccharides of gp120- Approximately 50% of the apparent molecular mass of
gp120 is carbohydrate. The structures of the oligosaccharide moieties released by
hydrazinolysis of CL44 [SEQ. ID NO. 12] rgp120 have been exhaustively analyzed (Mizuochi
et al., 1988a; Mizuochi et al., 1988b).
These authors found that 33% of the N-linked oligosaccharides were of the high
mannose-type, 4% were of the hybrid type, and 63% were of the complex type. Of the
complex oligosaccharides 90% were fucosylated and 94% were sialylated. The complex
structures were approximately 4% monoantennary, 61% biantennary, 19% triantennary and
16% tetraantennary. No O-linked oligosaccharides were found. Geyer et al. (1988) have
analyzed the oligosaccharides of gp120 from human cells infected with the IIIB isolate of HIV-1.
They found that high mannose-type oligosaccharides accounted for approximately 50% of
the carbohydrate structures. The remaining structures were fucosylated, partially sialylated
bi-, tri-, and tetraantennary complex-type oligosaccharides. No novel carbohydrate structures,
or moieties that would be expected to act as heterophile antigens in man, have been isolated
from gp120 from either source.

We have shown here that all 24 glycosylation sites are utilized, and that 13 of the 24
sites contain complex-type oligosaccharides as the predominant structures while 11 contain
primarily hybrid and/or high mannose structures. The demonstration of endo H-susceptible
structures at 11 of the 24 sites is consistent with the earlier results of Mizuochi et al.
(1988a, 1988b) who determined that nearly 40% of the total oligosaccharide structures
released from rgp120 were hybrid and/or high mannose-type oligosaccharides.
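
The comparison in the preceding paragraph rests on simple arithmetic, sketched below with the figures quoted in the text. The variable names are illustrative. The two percentages measure different things (released structures versus sites), so only rough agreement should be expected.

```python
# Mizuochi et al. (1988a, 1988b): N-linked oligosaccharides released from rgp120 (%).
high_mannose_pct, hybrid_pct, complex_pct = 33, 4, 63
endo_h_susceptible_pct = high_mannose_pct + hybrid_pct

# This work: sites carrying primarily high mannose and/or hybrid structures.
sites_total, sites_endo_h = 24, 11
site_fraction_pct = 100 * sites_endo_h / sites_total

print(f"released structures: {endo_h_susceptible_pct}%")
print(f"glycosylation sites: {site_fraction_pct:.1f}%")
```

The released-structure figure is 37% (the "nearly 40%" of the text) and the site-based figure is about 46%.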

The 24 potential N-linked glycosylation sites in the gp120 sequence are conserved to
a large extent between different viral isolates (Willey et al., 1986; Modrow et al., 1987).
Based on the gp120 sequence comparisons in these references, 13 of the sites on gp120
from the IIIB isolate of HIV-1 are absolutely conserved; these include 8 of the 11 sites that
carry predominantly hybrid-type and/or high mannose-type oligosaccharides. Thus, the less
fully processed (i.e. endo H-susceptible) oligosaccharides of gp120 are found preferentially
at the most conserved glycosylation sites. The remaining sites (8 complex and 3 hybrid/high
mannose) are relatively conserved, even though many of them occur in the hypervariable
regions. The positions of these sites may shift or be deleted, but there is always one or more
new site(s) within 5 to 10 residues of the reference IIIB site. Studies by Willey et al. (1988)
demonstrated that mutagenesis of Asn-232 to Gln decreased the infectivity of virions
containing the mutant gp120 molecules without affecting CD4 binding or syncytium
formation. At this time, no particular functional significance can be attributed to the type of
oligosaccharide structure at any of the sites.

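
The conservation figures quoted above can be cross-checked with a few lines of arithmetic. This is a sketch using only the counts stated in the text; the variable names are illustrative.

```python
# Site counts from the text.
total_sites = 24
complex_sites, high_mannose_hybrid_sites = 13, 11
conserved_total = 13             # absolutely conserved sites
conserved_high_mannose = 8       # of the 11 high mannose/hybrid sites

# Derived counts.
conserved_complex = conserved_total - conserved_high_mannose
remaining_complex = complex_sites - conserved_complex
remaining_high_mannose = high_mannose_hybrid_sites - conserved_high_mannose

frac_hm = 100 * conserved_high_mannose / high_mannose_hybrid_sites
frac_cx = 100 * conserved_complex / complex_sites
print(f"conserved high mannose/hybrid sites: {frac_hm:.0f}%")
print(f"conserved complex sites: {frac_cx:.0f}%")
```

The derived remainder (8 complex and 3 hybrid/high mannose sites) agrees with the text, and roughly 73% of the high mannose/hybrid sites are absolutely conserved against roughly 38% of the complex sites.
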
The role of the carbohydrate moieties on gp120 in CD4 binding has been investigated
by several authors (Lifson et al., 1986; Matthews et al., 1987; Fenouillet et al., 1989).
Those that employed enzymatic deglycosylation in the presence of detergents (Lifson et al.,
1986; Matthews et al., 1987) have concluded that the carbohydrates are not directly
involved with the binding, but that they are required to maintain the conformation of gp120
necessary for binding. In contrast, Fenouillet et al. (1989) enzymatically deglycosylated
gp120 without detergent and demonstrated that the CD4 binding affinity was preserved. It
therefore appears that the carbohydrate moieties of gp120 are not required for its binding to
CD4 but that the conformational stability of gp120 to detergents is lost after deglycosylation.

The rgp120 used for these determinations is functionally and structurally equivalent
to gp120 produced by HIV-1 infected cells.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Genentech, Inc.

(ii) TITLE OF INVENTION: HIV Envelope Polypeptides

(iii) NUMBER OF SEQUENCES: 15

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Genentech, Inc.
    (B) STREET: 460 Point San Bruno Blvd
    (C) CITY: South San Francisco
    (D) STATE: California
    (E) COUNTRY: USA
    (F) ZIP: 94080

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: 5.25 inch, 360 Kb floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: patin (Genentech)

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: U.S. S.N. 07/504,772
    (B) FILING DATE: 03-APRIL-1990

(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Adler, Carolyn R.
    (B) REGISTRATION NUMBER: 32,324
    (C) REFERENCE/DOCKET NUMBER: 639

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: 415/266-2614
    (B) TELEFAX: 415/952-9881
    (C) TELEX: 910/371-7168

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 18 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Cys Val Lys Leu Thr Pro Leu Cys Cys Asn Thr Ser Val Ile Thr
1               5                   10                  15

Gln Ala Cys
        18

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 40 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Pro Ile His Tyr Cys Ala Pro Ala Gly Phe Ala Ile Leu Lys Cys
1               5                   10                  15

Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys Thr Asn Val Ser
                20                  25                  30

Thr Val Gln Cys Thr His Gly Ile Arg Pro
                35                  40

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
1               5                   10      12

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 20 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Cys Ala Pro Ala Gly Phe Ala Ile Leu Lys Cys Cys Thr Asn Val
1               5                   10                  15

Ser Thr Val Gln Cys
                20

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Pro Ile His Tyr Cys Cys Thr His Gly Ile Arg Pro
1               5                   10      12

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 58 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Gly Gly Asp Pro Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly
1               5                   10                  15

Glu Phe Phe Tyr Cys Asn Ser Leu Pro Cys Arg Ile Lys Gln Phe
                20                  25                  30

Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
                35                  40                  45

Ile Ser Gly Gln Ile Arg Cys Ser Ser Asn Ile Thr Gly
                50                  55          58

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Cys Gly Gly Glu Phe Phe Tyr Cys Cys Arg Ile Lys Gln Phe Ile
1               5                   10                  15

Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro Ile
                20                  25                  30

Ser Gly Gln Ile Arg Cys
                35  36

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 21 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val His Asn Val
1               5                   10                  15

Trp Ala Thr His Ala Cys
                20  21

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 32 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr
1               5                   10                  15

Glu Val His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp
                20                  25                  30

Pro Asn
    32

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 479 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Thr Glu Lys Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp
1               5                   10                  15

Lys Glu Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala
                20                  25                  30

Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
                35                  40                  45

Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Val Asn Val Thr
                50                  55                  60

Glu Asn Phe Asn Met Trp Lys Asn Asp Met Val Glu Gln Met His
                65                  70                  75

Glu Asp Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val
                80                  85                  90

Lys Leu Thr Pro Leu Cys Val Ser Leu Lys Cys Thr Asp Leu Lys
                95                  100                 105

Asn Asp Thr Asn Thr Asn Ser Ser Ser Gly Arg Met Ile Met Glu
                110                 115                 120

Lys Gly Glu Ile Lys Asn Cys Ser Phe Asn Ile Ser Thr Ser Ile
                125                 130                 135

Arg Gly Lys Val Gln Lys Glu Tyr Ala Phe Phe Tyr Lys Leu Asp
                140                 145                 150

Ile Ile Pro Ile Asp Asn Asp Thr Thr Ser Tyr Thr Leu Thr Ser
                155                 160                 165

Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys Val Ser Phe
                170                 175                 180

Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe Ala Ile
                185                 190                 195

Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys Thr
                200                 205                 210

Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
                215                 220                 225

Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val
                230                 235                 240

Val Ile Arg Ser Ala Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile
                245                 250                 255

Val Gln Leu Asn Gln Ser Val Glu Ile Asn Cys Thr Arg Pro Asn
                260                 265                 270

Asn Asn Thr Arg Lys Ser Ile Arg Ile Gln Arg Gly Pro Gly Arg
                275                 280                 285

Ala Phe Val Thr Ile Gly Lys Ile Gly Asn Met Arg Gln Ala His
                290                 295                 300

Cys Asn Ile Ser Arg Ala Lys Trp Asn Asn Thr Leu Lys Gln Ile
                305                 310                 315

Asp Ser Lys Leu Arg Glu Gln Phe Gly Asn Asn Lys Thr Ile Ile
                320                 325                 330

Phe Lys Gln Ser Ser Gly Gly Asp Pro Glu Ile Val Thr His Ser
                335                 340                 345

Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys Asn Ser Thr Gln Leu
                350                 355                 360

Phe Asn Ser Thr Trp Phe Asn Ser Thr Trp Ser Thr Glu Gly Ser
                365                 370                 375

Asn Asn Thr Glu Gly Ser Asp Thr Ile Thr Leu Pro Cys Arg Ile
                380                 385                 390

Lys Gln Phe Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr
                395                 400                 405

Ala Pro Pro Ile Ser Gly Gln Ile Arg Cys Ser Ser Asn Ile Thr
                410                 415                 420

Gly Leu Leu Leu Thr Arg Asp Gly Gly Asn Asn Asn Asn Glu Ser
                425                 430                 435

Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg
                440                 445                 450

Ser Glu Leu Tyr Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly
                455                 460                 465

Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu
                470                 475             479

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 9 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Lys Tyr Ala Leu Ala Asp Ala Ser Leu
1               5               9

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 27 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Lys Tyr Ala Leu Ala Asp Ala Ser Leu Lys Met Ala Asp Pro Asn
1               5                   10                  15

Arg Phe Arg Gly Lys Asp Leu Pro Val Leu Asp Gln
                20                  25      27

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 481 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Thr Gln Tyr Val Thr Val Phe Tyr Gly Val Pro Thr Trp Lys Asn
1               5                   10                  15

Ala Thr Ile Pro Leu Phe Cys Ala Thr Arg Asn Arg Asp Thr Trp
                20                  25                  30

Gly Thr Ile Gln Cys Leu Pro Asp Asn Asp Asp Tyr Gln Glu Ile
                35                  40                  45

Thr Leu Asn Val Thr Glu Ala Phe Asp Ala Trp Asn Asn Thr Val
                50                  55                  60

Thr Glu Gln Ala Ile Glu Asp Val Trp His Leu Phe Glu Thr Ser
                65                  70                  75

Ile Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Ala Met Lys
                80                  85                  90

Cys Ser Ser Thr Glu Ser Ser Thr Gly Asn Asn Thr Thr Ser Lys
                95                  100                 105

Ser Thr Ser Thr Thr Thr Thr Thr Pro Thr Asp Gln Glu Gln Glu
                110                 115                 120

Ile Ser Glu Asp Thr Pro Cys Ala Arg Ala Asp Asn Cys Ser Gly
                125                 130                 135

Leu Gly Glu Glu Glu Thr Ile Asn Cys Gln Phe Asn Met Thr Gly
                140                 145                 150

Leu Glu Arg Asp Lys Lys Lys Gln Tyr Asn Glu Thr Trp Tyr Ser
                155                 160                 165

Lys Asp Val Val Cys Glu Thr Asn Asn Ser Thr Asn Gln Thr Gln
                170                 175                 180

Cys Tyr Met Asn His Cys Asn Thr Ser Val Ile Thr Glu Ser Cys
                185                 190                 195

Asp Lys His Tyr Trp Asp Ala Ile Arg Phe Arg Tyr Cys Ala Pro
                200                 205                 210

Pro Gly Tyr Ala Leu Leu Arg Cys Asn Asp Thr Asn Tyr Ser Gly
                215                 220                 225

Phe Ala Pro Asn Cys Ser Lys Val Val Ala Ser Thr Cys Thr Arg
                230                 235                 240

Met Met Glu Thr Gln Thr Ser Thr Trp Phe Gly Phe Asn Gly Thr
                245                 250                 255

Arg Ala Glu Asn Arg Thr Tyr Ile Tyr Trp His Gly Arg Asp Asn
                260                 265                 270

Arg Thr Ile Ile Ser Leu Asn Lys Tyr Tyr Asn Leu Ser Leu His
                275                 280                 285

Cys Lys Arg Pro Gly Asn Lys Ile Val Lys Gln Ile Met Leu Met
                290                 295                 300

Ser Gly His Val Phe His Ser His Gln Pro Ile Asn Lys Arg Pro
                305                 310                 315

Arg Gln Ala Trp Cys Trp Phe Lys Gly Lys Trp Lys Asp Ala Met
                320                 325                 330

Gln Glu Val Lys Glu Thr Leu Ala Lys His Pro Arg Tyr Arg Gly
                335                 340                 345

Thr Asn Asp Thr Arg Asn Ile Ser Phe Ala Ala Pro Gly Lys Gly
                350                 355                 360

Ser Asp Pro Glu Val Ala Tyr Met Trp Thr Asn Cys Arg Gly Glu
                365                 370                 375

Phe Leu Tyr Cys Asn Met Thr Trp Phe Leu Asn Trp Ile Glu Asn
                380                 385                 390

Lys Thr His Arg Asn Tyr Ala Pro Cys His Ile Lys Gln Ile Ile
                395                 400                 405

Asn Thr Trp His Lys Val Gly Arg Asn Val Tyr Leu Pro Pro Arg
                410                 415                 420

Glu Gly Glu Leu Ser Cys Asn Ser Thr Val Thr Ser Ile Ile Ala
                425                 430                 435

Asn Ile Asp Trp Gln Asn Asn Asn Gln Thr Asn Ile Thr Phe Ser
                440                 445                 450

Ala Glu Val Ala Glu Leu Tyr Arg Leu Glu Leu Gly Asp Tyr Lys
                455                 460                 465

Leu Val Glu Ile Thr Pro Ile Gly Phe Ala Pro Thr Lys Glu Lys
                470                 475                 480

Arg
481

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 8 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Gln Ala His Cys Asn Ile Ser Arg
1               5           8

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 4 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Cys Asn Asn Lys
1           4

Claims

We claim:

1. An isolated cyclized polypeptide sequence comprising the amino acid residues selected
from the group consisting of:

a) CVKLTPLCCNTSVITQAC [SEQ. ID NO. 1] and containing less than
about 28 amino acid residues;

b) PIHYCAPAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRP [SEQ. ID NO. 2] and
containing less than about 45 amino acid residues;

c) CNNKTFNGTGPC [SEQ. ID NO. 3] and containing less than about 22
amino acid residues;

d) CAPAGFAILKCCTNVSTVQC [SEQ. ID NO. 4] and containing less
than about 30 amino acid residues;

e) PIHYCCTHGIRP [SEQ. ID NO. 5] and containing less than about 22
amino acid residues;

f) GGDPEIVTHSFNCGGEFFYCNSLPCRIKQFINMWQEVGKAMYAPPISGQIRCSSNITG
[SEQ. ID NO. 6] and containing less than about 65 amino acid residues;

g) CGGEFFYCCRIKQFINMWQEVGKAMYAPPISGQIRC
[SEQ. ID NO. 7] and containing less than about 45 amino acid residues;

h) CASDAKAYDTEVHNVWATHAC [SEQ. ID NO. 8] and containing
less than about 30 amino acid residues; and

i) TTTLFCASDAKAYDTEVHNVWATHACVPTDPN [SEQ. ID
NO. 9] and containing less than about 50 amino acid residues.

2. A method for the prophylaxis or treatment of HIV infection comprising administering
a therapeutically effective dose of a sterile composition comprising the cyclized peptide
of claim 1 and a pharmaceutically acceptable vehicle to a patient having or at risk of
having HIV infection.

3. The method of claim 2 wherein the therapeutic dose is about from 0.5 x 1 0* to 5 x 1.0* 
molar. 

4. The method of claim 2 wherein the composition further contains an adjuvant.

5. An antibody which is directed to an antigenic determinant comprised by the isolated 
cyclized polypeptide of claim 1. 

6. The antibody of claim 5 which is conjugated to a cytotoxin. 

7. The antibody of claim 5 which is covalently bound to a detectable marker or a
water-insoluble matrix.

8. The antibody of claim 5 in a sterile, pharmaceutically acceptable vehicle.

9. An isolated polypeptide having an antigenic determinant or determinants
immunologically cross-reactive with a determinant of an HIV env polypeptide having
an amino acid sequence selected from the group consisting of:

a) residues 1-80;

b) residues 8-180;

c) residues 165-260;

d) residues 160-260;

e) residues 260-310; and

f) residues 320-479.

10. An antibody directed to an isolated polypeptide having an antigenic determinant or
determinants immunologically cross-reactive with a determinant of the HIV env
polypeptide of strain HTLV-IIIB having an amino acid sequence selected from the group
consisting of:

a) residues 1-80;

b) residues 8-180;

c) residues 165-260;

d) residues 160-260;

e) residues 260-310; and

f) residues 320-479.

11. The antibody of claim 10 which is conjugated to a cytotoxin.

12. The antibody of claim 10 which is covalently bound to a detectable marker or a
water-insoluble matrix.

13. The antibody of claim 10 in a sterile, pharmaceutically acceptable vehicle.

14. A method for the prophylaxis or treatment of HIV infection comprising administering
a therapeutically effective dose of a sterile composition comprising the antibody of
claim 5 and a pharmaceutically acceptable vehicle to a patient having or at risk of
having HIV infection.

15. The method of claim 14, wherein said antibody is conjugated to a cytotoxin.

16. A method for the prophylaxis or treatment of HIV infection comprising administering
a therapeutically effective dose of a sterile composition comprising the antibody of
claim 10 and a pharmaceutically acceptable vehicle to a patient having or at risk of
having HIV infection.

17. The method of claim 16, wherein said antibody is conjugated to a cytotoxin.
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